How to URL-encode only the non-ASCII characters of a URL in PHP, but leave reserved characters unencoded?

I have a simple one-liner that encodes only the non-ASCII characters in place, using preg_replace_callback:

$url = preg_replace_callback('/[^\x20-\x7f]/', function($match) {
    return urlencode($match[0]);
}, $url);

Note that anonymous functions are only supported in PHP 5.3+.
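
As a quick illustration of what the one-liner does, here is a minimal sketch that wraps it in a function and runs it on a sample URL. The function name encode_non_ascii and the example.com URL are made up for this example. Because the pattern has no /u modifier, it matches single bytes, so each byte of a multi-byte UTF-8 character is percent-encoded on its own.

```php
<?php
// Hypothetical helper name; it only wraps the one-liner from this answer.
function encode_non_ascii($url)
{
    // Without the /u modifier the pattern matches individual bytes, so every
    // byte outside 0x20-0x7f is passed to urlencode() separately.
    return preg_replace_callback('/[^\x20-\x7f]/', function ($match) {
        return urlencode($match[0]);
    }, $url);
}

// "\xC3\xA4" is the UTF-8 byte sequence for "ä".
$url = "http://example.com/p\xC3\xA4th/?foo=bar&q=\xC3\xA4";
echo encode_non_ascii($url), "\n";
// prints http://example.com/p%C3%A4th/?foo=bar&q=%C3%A4
```

The reserved characters (:, /, ?, &, =) pass through untouched because they are inside the \x20-\x7f range. The same range also leaves spaces (0x20) and DEL (0x7f) unencoded; narrowing it to \x21-\x7e would encode those too, if that is what you need.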


After researching a bit, I came to the conclusion that there's no way to do this nicely in PHP (other languages, such as Python and Perl, do seem to have functions for exactly this use case). This is the function I came up with (it encodes only the path component of the URL):

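function url_path_encode($url) {
    $path = parse_url($url, PHP_URL_PATH);
    // No path component (or an unparsable URL): nothing to encode.
    if (!is_string($path) || $path === '') return $url;
    // A '%' suggests the path is already encoded; return as-is to avoid double encoding.
    if (strpos($path, '%') !== false) return $url;
    // Encode each segment separately so the '/' separators survive.
    $encoded_path = array_map('urlencode', explode('/', $path));
    return str_replace($path, implode('/', $encoded_path), $url);
}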
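
One thing to be aware of: urlencode() turns a space into +, which is the form-encoding convention, while path segments usually want %20. Below is a hedged variant that uses rawurlencode() instead and replaces the path only at its own position rather than everywhere it occurs in the string. The name url_path_encode_raw and the sample URL are assumptions for illustration, not part of the function above.

```php
<?php
// Hypothetical variant of url_path_encode(); name and details are assumptions.
function url_path_encode_raw($url)
{
    $path = parse_url($url, PHP_URL_PATH);
    // Nothing to do when there is no path, or when it already looks encoded.
    if (!is_string($path) || $path === '' || strpos($path, '%') !== false) {
        return $url;
    }
    // rawurlencode() follows RFC 3986, so a space becomes %20 rather than +.
    $encoded = implode('/', array_map('rawurlencode', explode('/', $path)));
    // Start searching after "//" so the scheme separator is never matched.
    $start = strpos($url, '//');
    $offset = $start === false ? 0 : $start + 2;
    $pos = strpos($url, $path, $offset);
    if ($pos === false) {
        return $url;
    }
    // Replace only this one occurrence, leaving the query string alone.
    return substr_replace($url, $encoded, $pos, strlen($path));
}

echo url_path_encode_raw('http://example.com/my file/a?foo=bar'), "\n";
// prints http://example.com/my%20file/a?foo=bar
```

The '%' check is kept from the original, so an already-encoded path such as /a%20b is returned unchanged.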

I think this will do what you want.

<?php

$string = 'http://tinklarastis.omnitel.lt/kokius-aptarnavimo-kanalus-klientui-siulo-„omnitel“-1494/?foo=bar&fizz=buzz';

var_dump(filter_var($string, FILTER_SANITIZE_STRING, FILTER_FLAG_ENCODE_HIGH));

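This will get you the following. Note that the non-ASCII bytes come out as numeric HTML entities (such as &#226;), not as percent-encoded sequences, and that FILTER_SANITIZE_STRING is deprecated as of PHP 8.1: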

$ php test.php
string(140) "http://tinklarastis.omnitel.lt/kokius-aptarnavimo-kanalus-klientui-siulo-&#226;&#128;&#158;omnitel&#226;&#128;&#156;-1494/?foo=bar&fizz=buzz"