Python download without supplying a filename

edited after the question was clarified...

urlparse.urlsplit takes the URL you are opening and splits it into its component parts. You can then take the path portion and use its last /-delimited chunk as the filename.

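# Python 2 modules; Python 3 moved these to urllib.request and urllib.parse
import urllib, urlparse

# url is assumed to already hold the address to download
split = urlparse.urlsplit(url)
# The last /-delimited chunk of the path becomes the local filename
filename = "/tmp/" + split.path.split("/")[-1]
urllib.urlretrieve(url, filename)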
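
On Python 3 the same idea might look like the sketch below, since urlparse became urllib.parse and urlretrieve moved to urllib.request. The helper names local_path_for and download, the "/tmp" default, and the empty-name check are assumptions added for illustration; they are not part of the original snippet.

```python
import posixpath
from urllib.parse import urlsplit
from urllib.request import urlretrieve


def local_path_for(url, directory="/tmp"):
    """Build a local path from the last /-delimited chunk of the URL path."""
    name = posixpath.basename(urlsplit(url).path)
    if not name:
        # Hypothetical guard: "http://example.com/" has no last chunk to use.
        raise ValueError("URL path does not end in a filename: %r" % url)
    return posixpath.join(directory, name)


def download(url, directory="/tmp"):
    """Fetch url and save it under the name taken from its path."""
    path = local_path_for(url, directory)
    urlretrieve(url, path)  # this call performs the network request
    return path
```

Note that urlsplit keeps the query string out of the path, so something like "?x=1" never ends up in the filename.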

There is also urlopen, which returns a file-like object that you can read from without saving the data to a local file:

from urllib2 import urlopen

f = urlopen("http://example.com/")
for line in f:
  print len(line)
f.close()

(I'm not really sure if this is what you're asking for.)


The URL you're specifying doesn't refer to a file at all. It redirects to a web page that runs some JavaScript, which in turn causes your web browser to download the file. The actual address (a mirror) that my browser was directed to from the URL in question is:

http://mozilla.mirrors.evolva.ro//firefox/releases/3.6.3/win32/en-US/Firefox%20Setup%203.6.3.exe

I believe there are two ways that web servers specify the name of the file for downloads:

  1. The final segment of the URL path
  2. The Content-Disposition header, which can specify some other filename to use

For the file you want to download, I think you only need the last path segment of the URL (using the actual URL of the file, not that of the web page that chooses which mirror to use). For some downloads, however, you'd need to take the filename from the Content-Disposition header.
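
A rough Python 3 sketch of combining the two sources: prefer the Content-Disposition filename when the server sends one, otherwise fall back to the last path segment. The function name choose_filename is made up for this example, and the header value is passed in as a plain string so the logic can be tried without a network connection.

```python
import posixpath
from email.message import Message
from urllib.parse import unquote, urlsplit


def choose_filename(url, content_disposition=None):
    """Prefer the Content-Disposition filename, else the last URL path segment."""
    if content_disposition:
        # email.message.Message knows how to parse "attachment; filename=..."
        msg = Message()
        msg["Content-Disposition"] = content_disposition
        name = msg.get_filename()
        if name:
            # Keep only the final component in case the server sent a path.
            return posixpath.basename(name.replace("\\", "/"))
    # Fall back to the URL path; unquote turns %20 back into a space.
    return posixpath.basename(unquote(urlsplit(url).path))
```

With urllib.request, the second argument would come from something like response.headers.get("Content-Disposition") on the object returned by urlopen.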


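Here is a complete way to do it with Python 3 when the URL itself contains no filename. It reads the name from the Content-Disposition header and saves the already-open response, so the URL is only requested once (the cgi module used for header parsing in older code was removed in Python 3.13):

import os
import shutil
from urllib.request import urlopen

url = "http://cloud.ine.ru/s/JDbPr6W4QXnXKgo/download"
with urlopen(url) as remotefile:
    # The response headers are an email.message.Message, so get_filename()
    # returns the filename parameter of Content-Disposition (None if absent).
    # "download" is an arbitrary fallback name; basename drops any directory part.
    filename = os.path.basename(remotefile.headers.get_filename() or "download")
    with open(filename, "wb") as out:
        shutil.copyfileobj(remotefile, out)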

As a result, you should get a file named cargo_live_animals_parrot.jpg.