An example of using the code point break iterator to find all the code points in a string that cannot be transliterated to Latin-ASCII:
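<?php
$string = "Народm, Intl gurus get paid €10000/hr 😁";

// Normalize, convert any script to Latin, then reduce Latin to ASCII where possible.
$latinAscii = Transliterator::create('NFC; Any-Latin; Latin-ASCII;');
$transliterated = $latinAscii->transliterate($string);

// Walk the transliterated string one code point at a time.
$codePoints = IntlBreakIterator::createCodePointInstance();
$codePoints->setText($transliterated);

foreach ($codePoints->getPartsIterator() as $char) {
    $ord = IntlChar::ord($char);
    // Anything above U+007F is outside ASCII, so it survived transliteration.
    if ($ord > 127) {
        echo IntlChar::charName($ord) . "\n";
    }
}
?>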
Outputs:
EURO SIGN
GRINNING FACE WITH SMILING EYES
The IntlCodePointBreakIterator class
(PHP 5 >= 5.5.0, PHP 7, PHP 8)
Introduction
This break iterator identifies the boundaries between UTF-8 code points.
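As a rough illustration of what "boundaries between code points" means in practice, the sketch below walks a short string with first() and next() and collects the offsets it reports. The sample string and variable names are made up for this example, and the "\u{...}" escapes assume PHP 7 or later. Offsets are byte positions in the UTF-8 text, not character counts.

```php
<?php
// Sample text (hypothetical): "a" (1 byte), "é" (2 bytes), "😁" (4 bytes).
$text = "a\u{E9}\u{1F601}";

$it = IntlBreakIterator::createCodePointInstance();
$it->setText($text);

// Collect every boundary, starting at offset 0, until the iterator reports DONE.
$boundaries = [];
for ($pos = $it->first(); $pos !== IntlBreakIterator::DONE; $pos = $it->next()) {
    $boundaries[] = $pos;
}

echo implode(' ', $boundaries), "\n"; // prints "0 1 3 7"
```

Each gap between two consecutive offsets spans exactly one code point, which is why the multi-byte characters produce jumps of 2 and 4.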
Class synopsis
class IntlCodePointBreakIterator extends IntlBreakIterator {
/* Inherited constants */
/* Methods */
/* Inherited methods */
public static function IntlBreakIterator::createCharacterInstance(?string $locale = null): ?IntlBreakIterator
public static function IntlBreakIterator::createLineInstance(?string $locale = null): ?IntlBreakIterator
public static function IntlBreakIterator::createSentenceInstance(?string $locale = null): ?IntlBreakIterator
public static function IntlBreakIterator::createTitleInstance(?string $locale = null): ?IntlBreakIterator
public static function IntlBreakIterator::createWordInstance(?string $locale = null): ?IntlBreakIterator
public function IntlBreakIterator::getPartsIterator(string $type = IntlPartsIterator::KEY_SEQUENTIAL): IntlPartsIterator
}
Table of Contents
- IntlCodePointBreakIterator::getLastCodePoint — Get last code point passed over after advancing or receding the iterator
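A minimal sketch of how getLastCodePoint() might be used while advancing: after each successful next(), it returns the code point the iterator just passed over. The sample string and variable names are illustrative only, and the "\u{...}" escapes assume PHP 7 or later.

```php
<?php
// Sample text (hypothetical): "a", "é", and a grinning-face emoji.
$text = "a\u{E9}\u{1F601}";

$it = IntlBreakIterator::createCodePointInstance();
$it->setText($text);

// first() only positions the iterator; it does not pass over a code point.
$it->first();

$seen = [];
while ($it->next() !== IntlBreakIterator::DONE) {
    // Code point that the last next() call moved across.
    $seen[] = sprintf('U+%04X', $it->getLastCodePoint());
}

echo implode(' ', $seen), "\n"; // prints "U+0061 U+00E9 U+1F601"
```

This avoids extracting each substring and calling IntlChar::ord() on it when only the numeric code point is needed.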
User Contributed Notes 1 note
Matt Kynx
3 years ago