Trim unicode whitespace in PHP 5.2

PCRE Unicode properties can be used to achieve this.

Here is the code I experimented with; it seems to do what you want:

<?php
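function unicode_trim ($str) {
    // Strip leading and trailing runs of separator (Z) and "other" (C) characters.
    // Note: the pattern requires at least one such character at BOTH ends;
    // if only one end has whitespace, nothing matches and $str is returned unchanged.
    return preg_replace('/^[\pZ\pC]+([\PZ\PC]*)[\pZ\pC]+$/u', '$1', $str);
}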

$key = chr(0xc2) . chr(0xa0) . '#page#' . chr(0xc2) . chr(0xa0);

var_dump(unicode_trim($key));

Result:

[~]> php e.php
string(6) "#page#"

Explanation:

\p{xx} matches a character with the xx property

\P{xx} matches a character without the xx property

If xx has only one character, then {} can be dropped, e.g. \p{Z} is the same as \pZ

Z stands for all separators (spaces, line separators, paragraph separators); C stands for all "other" characters (for example control characters such as tab and newline).
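
To make the two categories concrete, here is a small sketch that classifies single characters. The helper name char_category is hypothetical, not part of the original answer, and the sketch assumes PCRE was built with Unicode property support:

```php
<?php
// Hypothetical helper for illustration only: reports whether a single
// UTF-8 character falls in the Z (separator) or C (other) category.
function char_category($char) {
    if (preg_match('/^\pZ$/u', $char)) {
        return 'Z';
    }
    if (preg_match('/^\pC$/u', $char)) {
        return 'C';
    }
    return 'neither';
}

$nbsp = chr(0xc2) . chr(0xa0);    // U+00A0 NO-BREAK SPACE encoded as UTF-8

var_dump(char_category($nbsp));   // string(1) "Z"
var_dump(char_category(' '));     // string(1) "Z"
var_dump(char_category("\t"));    // string(1) "C"  (tab is a control character)
var_dump(char_category('#'));     // string(7) "neither"
```

Tab and newline fall under C rather than Z, which is why the trim pattern combines both classes.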


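A simpler alternative that trims each end independently, so it also works when only one end has whitespace:

preg_replace('/^[\pZ\pC]+|[\pZ\pC]+$/u', '', $str);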
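
As a rough check of the alternation pattern, the sketch below wraps it in a function and tries leading-only, trailing-only, two-sided, and interior whitespace. The function name unicode_trim_either is an assumption made for this example:

```php
<?php
// Hypothetical wrapper around the alternation pattern; the name is
// made up for this example.
function unicode_trim_either($str) {
    // Either alternative may match on its own, so each end is trimmed
    // independently of the other.
    return preg_replace('/^[\pZ\pC]+|[\pZ\pC]+$/u', '', $str);
}

$nbsp = chr(0xc2) . chr(0xa0);    // U+00A0 NO-BREAK SPACE encoded as UTF-8

var_dump(unicode_trim_either($nbsp . '#page#'));          // string(6) "#page#"
var_dump(unicode_trim_either('#page#' . $nbsp));          // string(6) "#page#"
var_dump(unicode_trim_either($nbsp . '#page#' . $nbsp));  // string(6) "#page#"

// Whitespace inside the string is left alone: 3 + 2 + 3 bytes remain.
var_dump(strlen(unicode_trim_either('#pa' . $nbsp . 'ge#')));  // int(8)
```

With the /u modifier, preg_replace returns NULL when the input is not valid UTF-8, so callers that may receive arbitrary bytes should check for that.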