Changing hostname in a url

Using the urlparse and urlunparse functions of the urlparse module (Python 2; in Python 3 these functions live in urllib.parse):

import urlparse

old_url = 'https://www.google.dk:80/barbaz'
url_lst = list(urlparse.urlparse(old_url))
# Now url_lst is ['https', 'www.google.dk:80', '/barbaz', '', '', '']
url_lst[1] = 'www.foo.dk:80'
# Now url_lst is ['https', 'www.foo.dk:80', '/barbaz', '', '', '']
new_url = urlparse.urlunparse(url_lst)

print(old_url)
print(new_url)

Output:

https://www.google.dk:80/barbaz
https://www.foo.dk:80/barbaz
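For readers on Python 3, here is a minimal sketch of the same list-based approach, assuming only that the two functions are imported from urllib.parse instead of the Python 2 urlparse module. The variable name parts is just an illustrative choice.

```python
from urllib.parse import urlparse, urlunparse

old_url = 'https://www.google.dk:80/barbaz'

# The parse result is an immutable tuple, so copy it into a list first.
# Order: [scheme, netloc, path, params, query, fragment]
parts = list(urlparse(old_url))

# Index 1 is the netloc, i.e. the host plus the optional port.
parts[1] = 'www.foo.dk:80'

new_url = urlunparse(parts)
print(new_url)
```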

You can use the urllib.parse.urlparse function and the ParseResult._replace method (Python 3):

>>> import urllib.parse
>>> parsed = urllib.parse.urlparse("https://www.google.dk:80/barbaz")
>>> replaced = parsed._replace(netloc="www.foo.dk:80")
>>> print(replaced)
ParseResult(scheme='https', netloc='www.foo.dk:80', path='/barbaz', params='', query='', fragment='')

If you're using Python 2, then replace urllib.parse with urlparse.
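The snippet above stops at the ParseResult object. To get a URL string back, one option is the geturl() method of the result. This is a small sketch for Python 3; the name new_url is only illustrative.

```python
from urllib.parse import urlparse

parsed = urlparse("https://www.google.dk:80/barbaz")
replaced = parsed._replace(netloc="www.foo.dk:80")

# geturl() reassembles the six components into a single URL string.
new_url = replaced.geturl()
print(new_url)
```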

ParseResult is a subclass of namedtuple and _replace is a namedtuple method that:

returns a new instance of the named tuple replacing specified fields with new values
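A quick way to see this behaviour for yourself (Python 3 sketch): the original result is left untouched, and the field names that _replace accepts are listed in _fields.

```python
from urllib.parse import urlparse

parsed = urlparse("https://www.google.dk:80/barbaz")
replaced = parsed._replace(netloc="www.foo.dk:80")

# _replace builds a new tuple; the original keeps its old netloc.
print(parsed.netloc)
print(replaced.netloc)

# Only these names are valid keyword arguments for _replace.
print(parsed._fields)
```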

UPDATE:

As @2rs2ts pointed out in the comments, the netloc attribute includes the port number.

Good news: ParseResult has hostname and port attributes. Bad news: hostname and port are not fields of the namedtuple; they are computed properties, so you can't do parsed._replace(hostname="www.foo.dk"). It raises an exception.
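The following Python 3 sketch illustrates both halves of that statement. The exact exception type raised by _replace may depend on the Python version, so the example catches both ValueError and TypeError rather than assuming one; the error variable is only there for illustration.

```python
from urllib.parse import urlparse

parsed = urlparse("https://www.google.dk:80/barbaz")

# hostname and port are derived from netloc on access.
print(parsed.hostname)
print(parsed.port)

error = None
try:
    # hostname is not one of the tuple's fields, so this fails.
    parsed._replace(hostname="www.foo.dk")
except (ValueError, TypeError) as exc:
    error = exc
    print(type(exc).__name__, exc)
```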

If you don't want to split on : yourself, and your URL always has a port number and never has a username and password (as in "https://username:password@www.google.dk:80/barbaz"), you can do:

parsed._replace(netloc="{}:{}".format(parsed.hostname, parsed.port))
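If those assumptions do not hold, one possible approach is a small helper that rebuilds the netloc piece by piece. This is only a sketch for Python 3: replace_hostname is a hypothetical name, not a standard library function, and it assumes new_host is already a valid host string (an IPv6 literal would need its brackets supplied by the caller).

```python
from urllib.parse import urlsplit


def replace_hostname(url, new_host):
    """Return url with only the host swapped; userinfo and port are kept."""
    parts = urlsplit(url)

    netloc = new_host
    # Re-attach the port only if the original URL had one.
    if parts.port is not None:
        netloc = "{}:{}".format(netloc, parts.port)

    # Everything before the last '@' in the netloc is the userinfo.
    userinfo, at, _ = parts.netloc.rpartition("@")
    if at:
        netloc = "{}@{}".format(userinfo, netloc)

    return parts._replace(netloc=netloc).geturl()


print(replace_hostname("https://www.google.dk:80/barbaz", "www.foo.dk"))
print(replace_hostname("https://www.google.dk/barbaz", "www.foo.dk"))
print(replace_hostname("https://username:password@www.google.dk:80/barbaz", "www.foo.dk"))
```

The helper uses urlsplit rather than urlparse only because the params component is not needed here; either would work for this purpose.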

You can take advantage of urlsplit and urlunsplit from Python 2's urlparse module (urllib.parse in Python 3):

>>> from urlparse import urlsplit, urlunsplit
>>> url = list(urlsplit('https://www.google.dk:80/barbaz'))
>>> url
['https', 'www.google.dk:80', '/barbaz', '', '']
>>> url[1] = 'www.foo.dk:80'
>>> new_url = urlunsplit(url)
>>> new_url
'https://www.foo.dk:80/barbaz'

As the docs state, the argument passed to urlunsplit() "can be any five-item iterable", so the above code works as expected.
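Because any five-item iterable is accepted, a plain tuple works as well. Here is a short Python 3 sketch that unpacks the result of urlsplit and passes a new tuple to urlunsplit; the unpacked variable names are illustrative.

```python
from urllib.parse import urlsplit, urlunsplit

# urlsplit yields five components (no separate params field).
scheme, netloc, path, query, fragment = urlsplit('https://www.google.dk:80/barbaz')

# Pass a fresh tuple with only the netloc changed.
new_url = urlunsplit((scheme, 'www.foo.dk:80', path, query, fragment))
print(new_url)
```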
