Get first character of UTF-8 string

PHP strings don't understand multibyte encodings by default. Array-like indexing chops off the first byte, and if that byte happens not to be in the ASCII range, you get this result.

Use the mb_substr function.
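To make the difference concrete, here is a minimal sketch comparing the two approaches. It assumes the mbstring extension is enabled and that the source file itself is saved as UTF-8; the variable names are illustrative only.

```php
<?php
// Assumption: this file is saved as UTF-8 and ext-mbstring is loaded.
$title = "воскресенье";

// Byte-oriented access: returns only the first byte of the string.
// "в" is encoded in UTF-8 as the two bytes d0 b2, so this is half a character.
$firstByte = $title[0];
echo bin2hex($firstByte), "\n";                          // d0
var_dump(mb_check_encoding($firstByte, 'UTF-8'));        // bool(false)

// Character-oriented access: returns the whole first character.
$firstChar = mb_substr($title, 0, 1, 'UTF-8');
echo bin2hex($firstChar), "\n";                          // d0b2
```

The lone byte d0 is not valid UTF-8 on its own, which is why browsers typically render it as a replacement character.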


As mentioned in other answers, PHP's regular substring handling doesn't understand multibyte characters (as you get with UTF-8, for example).

What the other answers don't mention is that you should tell mb_substr which encoding to use by passing it as the fourth argument.

So, for example, I use this:

 mb_substr("Sunday", 0, 1, 'UTF-8'); // Returns S
 mb_substr("воскресенье", 0, 1, 'UTF-8'); // Returns в

There are several things you need to consider:

  1. Check that data in the DB is being stored as UTF-8
  2. Check that the client connection to the DB is in UTF-8 (for example, in MySQL see: http://www.php.net/manual/en/mysqli.character-set-name.php)
  3. Make sure that the page has its content type set as UTF-8 [you can use header('Content-Type: text/html; charset=utf-8'); ]
  4. Try setting the internal encoding, using mb_internal_encoding("UTF-8");
  5. Use mb_substr instead of array index notation

$first_char = mb_substr($title, 0, 1);
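Steps 4 and 5 above can be combined into a small helper. This is only a sketch: first_char() is a hypothetical name, not something from the original answers, and it assumes the mbstring extension is available and the input is meant to be UTF-8.

```php
<?php
// Hypothetical helper (not part of PHP or Yii): returns the first character
// of a string, validating the encoding before slicing.
function first_char(string $text, string $encoding = 'UTF-8'): string
{
    // Reject byte sequences that are not valid in the expected encoding,
    // e.g. data that was stored or transferred in a different charset.
    if (!mb_check_encoding($text, $encoding)) {
        throw new InvalidArgumentException("Input is not valid $encoding");
    }

    // Passing the encoding explicitly means the result does not depend
    // on the current mb_internal_encoding() setting.
    return mb_substr($text, 0, 1, $encoding);
}

// Step 4: set the default for mb_* calls that omit the encoding argument.
mb_internal_encoding('UTF-8');

$title = 'воскресенье';
$first_char = first_char($title);
echo $first_char, "\n";
```

Validating first is a design choice for this sketch: it surfaces encoding problems from the DB or the connection (steps 1 and 2) as an exception instead of silently returning a broken character.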

You need to use PHP's multibyte string functions to properly handle Unicode strings:

http://www.php.net/manual/en/ref.mbstring.php

http://www.php.net/manual/en/function.mb-substr.php

You'll also need to specify the character encoding in the <head> of your HTML:

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />

or, if your pages are actually encoded as UTF-16:

<meta http-equiv="Content-Type" content="text/html; charset=UTF-16" />

Tags:

PHP

Yii