How to replace all percent-encoded UTF-8 substrings with plain UTF-8 text?

With bash, zsh, GNU echo, or some implementations of ksh on some systems, you can decode such a string by replacing every % with \x and passing the result to echo -e:

url_encoded_string="%D1%80%D0%B5%D1%81%D1%83%D1%80%D1%81%D1%8B"
temp_string=${url_encoded_string//%/\\x}

printf '%s\n' "$temp_string"
# output: \xD1\x80\xD0\xB5\xD1\x81\xD1\x83\xD1\x80\xD1\x81\xD1\x8B

echo -e "$temp_string"
# output: ресурсы

(This assumes the string contains no backslash characters and is not itself one of the options supported by your echo command, such as -n.)

As @JoshLee also points out, the "echo caveat" can be avoided by using printf directly, in place of the temporary variable and echo -e. Quote the expansion so the shell does not split or glob it:

printf '%b\n' "${url_encoded_string//%/\\x}"
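
The substitution and printf can also be wrapped in a small bash function. The following is only a sketch under stated assumptions: the name urldecode is hypothetical, it relies on bash's printf '%b' understanding \xHH, it doubles literal backslashes first so that they survive decoding, and it assumes every % in the input starts a valid two-digit hex escape.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the original answer): decode a
# percent-encoded string with bash parameter expansion and printf '%b'.
urldecode() {
    local s=$1
    # Double any literal backslashes so %b prints them unchanged.
    s=${s//\\/\\\\}
    # Turn each % into \x, then let %b expand the \xHH escapes.
    printf '%b\n' "${s//%/\\x}"
}

urldecode '%D1%80%D0%B5%D1%81%D1%83%D1%80%D1%81%D1%8B'
# output: ресурсы
```

Because the data is passed as an argument to '%b' rather than to echo, a string such as -n or -e is printed instead of being treated as an option.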


With perl:

perl -pe 's/%([0-9A-F]{2})/pack"H2",$1/gei'

Or with URI::Escape:

perl -MURI::Escape -pe '$_=uri_unescape$_'