PHP, str_pad unicode issue

You can also use str_repeat.

function mb_str_pad($string, $length, $pad_string = " ")
{
    return $string . str_repeat($pad_string, $length - mb_strlen($string));
}

You can upgrade this function by adding STR_PAD_RIGHT, STR_PAD_LEFT or STR_PAD_BOTH;


You need a multibyte version of str_pad(), like below. It's inspired by the source code of str_pad().

function mb_str_pad($input, $pad_length, $pad_string = ' ', $pad_type = STR_PAD_RIGHT, $encoding = 'UTF-8')
{
    $input_length = mb_strlen($input, $encoding);
    $pad_string_length = mb_strlen($pad_string, $encoding);

    if ($pad_length <= 0 || ($pad_length - $input_length) <= 0) {
        return $input;
    }

    $num_pad_chars = $pad_length - $input_length;

    switch ($pad_type) {
        case STR_PAD_RIGHT:
            $left_pad = 0;
            $right_pad = $num_pad_chars;
            break;

        case STR_PAD_LEFT:
            $left_pad = $num_pad_chars;
            $right_pad = 0;
            break;

        case STR_PAD_BOTH:
            $left_pad = floor($num_pad_chars / 2);
            $right_pad = $num_pad_chars - $left_pad;
            break;
    }

    $result = '';
    for ($i = 0; $i < $left_pad; ++$i) {
        $result .= mb_substr($pad_string, $i % $pad_string_length, 1, $encoding);
    }
    $result .= $input;
    for ($i = 0; $i < $right_pad; ++$i) {
        $result .= mb_substr($pad_string, $i % $pad_string_length, 1, $encoding);
    }

    return $result;
}


$str = "nü";
$pad = "ü";

echo mb_str_pad($str, 5, $pad);

I think you need to look inside more php.net (here: http://php.net/str_pad#111147). But I changed it a bit.

Note: Don't forget to call this before mb_internal_encoding("utf-8");.

mb_internal_encoding("utf-8");

function str_pad_unicode($str, $pad_len, $pad_str = ' ', $dir = STR_PAD_RIGHT) {
    $str_len = mb_strlen($str);
    $pad_str_len = mb_strlen($pad_str);
    if (!$str_len && ($dir == STR_PAD_RIGHT || $dir == STR_PAD_LEFT)) {
        $str_len = 1; // @debug
    }
    if (!$pad_len || !$pad_str_len || $pad_len <= $str_len) {
        return $str;
    }

    $result = null;
    if ($dir == STR_PAD_BOTH) {
        $length = ($pad_len - $str_len) / 2;
        $repeat = ceil($length / $pad_str_len);
        $result = mb_substr(str_repeat($pad_str, $repeat), 0, floor($length))
                . $str
                . mb_substr(str_repeat($pad_str, $repeat), 0, ceil($length));
    } else {
        $repeat = ceil($str_len - $pad_str_len + $pad_len);
        if ($dir == STR_PAD_RIGHT) {
            $result = $str . str_repeat($pad_str, $repeat);
            $result = mb_substr($result, 0, $pad_len);
        } else if ($dir == STR_PAD_LEFT) {
            $result = str_repeat($pad_str, $repeat);
            $result = mb_substr($result, 0, 
                        $pad_len - (($str_len - $pad_str_len) + $pad_str_len))
                    . $str;
        }
    }

    return $result;
}

$t = STR_PAD_LEFT;
$s = '...';
$as = 'AO';
$ms = 'ÄÖ';
echo "<pre>\n";
for ($i = 3; $i <= 1000; $i++) {
    $s1 = str_pad($s, $i, $as, $t); // can not inculde unicode char!!!
    $s2 = str_pad_unicode($s, $i, $ms, $t);
    $l1 = strlen($s1);
    $l2 = mb_strlen($s2);
    echo "len $l1: $s1 \n";
    echo "len $l2: $s2 \n";
    echo "\n";
    if ($l1 != $l2) die("Fail!");
}
echo "</pre>";

Test here: http://codepad.viper-7.com/3jTEgt