Case-Mapping
Introduction
Case-mapping, or case conversion, is performed every time a character is changed from upper case to lower case, or from lower case to upper case. Case-Converter performs case-mapping every time you use it.
There are two kinds of case-mapping:
- Simple case-mapping
- Full case-mapping
Simple case-mapping is a one-to-one character mapping, for example a single
character "A" is replaced with another single character "a".
As you can imagine, full case-mapping performs one-to-many character
replacements (more precisely, one code point can be replaced by several code points).
In real-world use cases the difference between the two rarely shows, because
full case-mapping only concerns a very small set of characters. For example, in
German the letter "ß" is traditionally lowercase only and should be mapped to "SS"
in uppercase words.
Case-Converter behaviour
By default, Case-Converter performs full case-mapping:
// Full case-mapping
$ger = new Convert('Straße');
echo $ger->toUpper(); // output: STRASSE
If you want to perform simple case-mapping, you have to
call ->forceSimpleCaseMapping():
// Simple case-mapping
$ger = new Convert('Straße');
$ger->forceSimpleCaseMapping();
echo $ger->toUpper(); // output: STRAßE
As you can see, full case-mapping can change the length of the string, whereas simple case-mapping never does.
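The length change can also be checked without Case-Converter by calling mbstring directly. The following is only a minimal sketch: it assumes PHP 7.3 or later (for the MB_CASE_UPPER_SIMPLE constant), the mbstring extension, and a UTF-8 encoded source file. The variable names are illustrative.

```php
<?php
// Assumes PHP >= 7.3 with the mbstring extension enabled.
$word = 'Straße';

$full   = mb_convert_case($word, MB_CASE_UPPER, 'UTF-8');        // full case-mapping
$simple = mb_convert_case($word, MB_CASE_UPPER_SIMPLE, 'UTF-8'); // simple case-mapping

// Lengths are counted in characters (code points), not in bytes.
printf("%s: %d\n", $word, mb_strlen($word, 'UTF-8'));     // Straße: 6
printf("%s: %d\n", $full, mb_strlen($full, 'UTF-8'));     // STRASSE: 7
printf("%s: %d\n", $simple, mb_strlen($simple, 'UTF-8')); // STRAßE: 6
```

Code that relies on a fixed string length, such as column padding or character offsets, is where this difference tends to matter.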
Case-Mapping in PHP
PHP 7.3 introduced full case-mapping, which allows one-to-many character mapping. In practice, this means that you can get different results depending on your PHP version.
Internally, Case-Converter uses mb_convert_case(). This function takes a mode constant that tells it which conversion to perform. For example:
mb_convert_case('Foo', MB_CASE_UPPER); // FOO
Prior to PHP 7.3, these were the available constants and their use:
Constant | Meaning |
---|---|
MB_CASE_UPPER | Performs simple upper-case fold conversion. |
MB_CASE_LOWER | Performs simple lower-case fold conversion. |
MB_CASE_TITLE | Performs simple title-case fold conversion. |
From PHP 7.3 onwards, new constants were added and the meaning of the existing ones changed:
Constant | Meaning |
---|---|
MB_CASE_UPPER | Performs a full upper-case folding. |
MB_CASE_LOWER | Performs a full lower-case folding. |
MB_CASE_TITLE | Performs a full title-case conversion. |
MB_CASE_UPPER_SIMPLE | Performs simple upper-case fold conversion. |
MB_CASE_LOWER_SIMPLE | Performs simple lower-case fold conversion. |
MB_CASE_TITLE_SIMPLE | Performs simple title-case fold conversion. |
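Because the meaning of MB_CASE_UPPER depends on the PHP version, code that needs the same result everywhere has to pick the constant at runtime. The sketch below shows one possible way to do that; toUpperSimple() is a hypothetical helper written for this example, not part of Case-Converter or of mbstring.

```php
<?php
/**
 * Hypothetical helper: upper-cases a UTF-8 string with simple
 * case-mapping, whatever the PHP version.
 */
function toUpperSimple(string $text): string
{
    // MB_CASE_UPPER_SIMPLE only exists from PHP 7.3 onwards.
    // Before 7.3, MB_CASE_UPPER itself performs the simple mapping.
    $mode = defined('MB_CASE_UPPER_SIMPLE')
        ? constant('MB_CASE_UPPER_SIMPLE')
        : MB_CASE_UPPER;

    return mb_convert_case($text, $mode, 'UTF-8');
}

echo toUpperSimple('Straße'), "\n"; // STRAßE
```

The opposite direction is not possible with the constants listed above: on versions older than 7.3 there is no mode that performs full case-mapping.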
Locale dependent mapping
Some case-mappings are locale dependent. This is the case in Turkish, where the
small letter "i" should be replaced by a capital letter with a dot, "İ".
However, according to the PHP documentation:
Only unconditional, language agnostic full case-mapping is performed.
This means that locale-dependent mappings are not performed.
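The effect on the Turkish example can be seen by calling mbstring directly. This is a minimal sketch, assuming the mbstring extension and PHP 7.0 or later (for the "\u{...}" escape syntax); it only illustrates the language-agnostic behaviour quoted above.

```php
<?php
// U+0130 is LATIN CAPITAL LETTER I WITH DOT ABOVE, the Turkish capital of "i".
$turkishCapital = "\u{0130}";

$upper = mb_convert_case('i', MB_CASE_UPPER, 'UTF-8');

var_dump($upper === 'I');             // bool(true): language-agnostic result
var_dump($upper === $turkishCapital); // bool(false): the Turkish rule is not applied
```

If Turkish casing is required, it has to be handled outside of mb_convert_case().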
Resources
- PHP 7.3 Full Case-Mapping and Case-Folding Support
- https://www.php.net/manual/en/migration73.new-features.php#migration73.new-features.mbstring.case-mapping-folding
- mb_convert_case()
- https://www.php.net/manual/en/function.mb-convert-case.php
- mbstring constant
- https://www.php.net/manual/en/mbstring.constants.php