wget and encoded URL

Since this is annoyingly common, there are various converters available, e.g. this site. You can use one of these to decode the URL, so it will convert this:

http%3A%2F%2Fdl.minitoons.ir%2Flongs%2FKhumba%20(2013)%20%5BEN%5D%20%5BBR-Rip%20720p%5D%20-%20%5Bwww.minitoons.ir%5D.rar

to:

http://dl.minitoons.ir/longs/Khumba (2013) [EN] [BR-Rip 720p] - [www.minitoons.ir].rar

It would be nice to have a command line version though...

EDIT:

Found a command line version - basically:

echo "http%3A%2F%2F-REST-OF-URL" | sed -e's/%\([0-9A-Fa-f][0-9A-Fa-f]\)/\\\\\x\1/g' | xargs echo -e

This can be implemented in a script like this to decode the URL:

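#!/bin/bash
# Rewrite each %HH escape (upper- or lower-case hex) so that echo -e expands it to the real character.
echo "$@" | sed -e's/%\([0-9A-Fa-f][0-9A-Fa-f]\)/\\\\\x\1/g' | xargs echo -e
exit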

which, if saved and made executable, works quite nicely.

There is also this script, which will download the URL as well:

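#!/bin/bash
# Decode the URL as above, then pass it to wget: -c resumes a partial download, -i - reads URLs from stdin.
echo "$@" | sed -e's/%\([0-9A-Fa-f][0-9A-Fa-f]\)/\\\\\x\1/g' | xargs echo -e | wget -c -i -
exit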

N.B. The case of the scheme and hostname does not matter (e.g. HTTP://WWW.UBUNTU.COM works), but the path part of a URL can be case-sensitive, depending on the server.
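The decoding step in the scripts above can also be done in bash alone, without sed and xargs. This is only a sketch: urldecode is a made-up function name, and it assumes bash (for the ${var//pattern/replacement} expansion and for the way printf's %b handles \xHH) and an input that contains no literal backslashes.

```shell
#!/bin/bash
# urldecode: hypothetical helper name, not part of wget or any package.
# Rewrites every % as \x, then lets printf's %b expand each \xHH escape.
urldecode() {
    printf '%b\n' "${1//%/\\x}"
}

urldecode 'http%3A%2F%2Fdl.minitoons.ir%2Flongs%2FKhumba%20(2013)%20%5BEN%5D%20%5BBR-Rip%20720p%5D%20-%20%5Bwww.minitoons.ir%5D.rar'
# prints: http://dl.minitoons.ir/longs/Khumba (2013) [EN] [BR-Rip 720p] - [www.minitoons.ir].rar
```

Unlike the xargs version, this keeps quotes and repeated spaces in the decoded URL intact, since the string is never split into words again.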


You should use it like this:

wget "http://dl.minitoons.ir/longs/Khumba%20(2013)%20[EN]%20[BR-Rip%20720p]%20-%20[www.minitoons.ir].rar"

Just replace every space with %20. Better still, copy your original link and paste it into the Chromium address bar; it will format the URL for you automatically. Then copy it from there into your terminal.
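If you would rather not do the replacement by hand, something like the following should work. It is a minimal sketch: encode_spaces is a made-up name, and it only handles spaces, nothing else.

```shell
# encode_spaces: hypothetical helper; replaces each space with %20
# and leaves every other character untouched.
encode_spaces() {
    printf '%s\n' "$1" | sed 's/ /%20/g'
}

encode_spaces 'http://dl.minitoons.ir/longs/Khumba (2013) [EN] [BR-Rip 720p] - [www.minitoons.ir].rar'
# prints: http://dl.minitoons.ir/longs/Khumba%20(2013)%20[EN]%20[BR-Rip%20720p]%20-%20[www.minitoons.ir].rar
```

The result can be passed straight to wget, e.g. wget "$(encode_spaces "$url")", where $url is whatever variable holds your link.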


Wget expects the URL to have the following format:

[protocol://]host/path

The protocol is optional. In the absence of a protocol, Wget assumes HTTP.

Wget accepts percent-encoded URLs just fine, but the delimiters between protocol, host and path cannot be percent-encoded.

This is also why Wget changed the casing of the URL. Since it didn't find a single unencoded slash, it assumed that

http://dl.minitoons.ir/longs/khumba (2013) [en] [br-rip 720p] - [www.minitoons.ir].rar

is the hostname (which would be case-insensitive). The actual hostname is, of course, dl.minitoons.ir.
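To illustrate the splitting step only, here is a rough sketch. It is not Wget's actual code (Wget also decodes and lower-cases what it takes to be the host), guess_host is a made-up name, and the path is shortened for readability.

```shell
# guess_host: hypothetical illustration of "host = everything before the first literal slash".
guess_host() {
    rest=${1#*://}               # strip an unencoded "protocol://" prefix, if present
    printf '%s\n' "${rest%%/*}"  # keep everything before the first literal slash
}

guess_host 'http://dl.minitoons.ir/longs/Khumba.rar'
# prints: dl.minitoons.ir
guess_host 'http%3A%2F%2Fdl.minitoons.ir%2Flongs%2FKhumba.rar'
# prints the whole string unchanged, because it contains no literal "://" or "/"
```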

For an automatic solution, substituting %3A%2F%2F and the %2F after the hostname by :// and / would suffice, but it's just as easy to decode the URL at once. @Wilf already gave a good solution for this.
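The minimal substitution could look roughly like this. It is a sketch, not a tested tool: fix_delimiters is a made-up name, and it assumes the first %2F after the protocol separator is the one that follows the hostname.

```shell
# fix_delimiters: hypothetical helper that un-encodes only the delimiters.
fix_delimiters() {
    # Without the g flag, each expression replaces only the first match:
    # first the encoded "://", then the first remaining %2F (the one after the host).
    printf '%s\n' "$1" | sed -e 's|%3[Aa]%2[Ff]%2[Ff]|://|' -e 's|%2[Ff]|/|'
}

fix_delimiters 'http%3A%2F%2Fdl.minitoons.ir%2Flongs%2FKhumba%20(2013)%20%5BEN%5D%20%5BBR-Rip%20720p%5D%20-%20%5Bwww.minitoons.ir%5D.rar'
# prints: http://dl.minitoons.ir/longs%2FKhumba%20(2013)%20%5BEN%5D%20%5BBR-Rip%20720p%5D%20-%20%5Bwww.minitoons.ir%5D.rar
```

Everything else stays percent-encoded, which Wget accepts as it is.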

However, if you're going to type the Wget command manually, just do this:

wget "dl.minitoons.ir/longs%2FKhumba%20(2013)%20%5BEN%5D%20%5BBR-Rip%20720p%5D%20-%20%5Bwww.minitoons.ir%5D.rar"
