How to get a perfect local copy of a web page?

The reason "Lots of JavaScript & such seems to trip it up" is probably that so many companies use content management systems (Joomla, Drupal and Wordpress) , which use those to query databases for content. If that is the case, you will not get the whole page like you want. So it depends on the web page.


Try downloading the website using HTTrack. The options allow you to configure how the locally downloaded files will be linked, and what exactly is downloaded. Windows, Linux, and Mac builds are available.


You need to download the entire website with HTTrack (set it so it doesn't download external JavaScript files). Just run it, look at which directories were downloaded, then run HTTrack again and exclude the external domains you don't need (e.g. -*.googlesyndication.com/* -*.facebook.net/* -*.google-analytics.com/* etc.), as in the example below.
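
As a rough sketch (the output directory and the exact filter list are illustrative, not from the original answer), the same thing can be done from HTTrack's command line by passing exclusion filters for those external domains:

# Mirror the site into ./mirror, keeping only the site's own domain and excluding common third-party hosts
httrack "http://www.yourdomain.com/" -O ./mirror "+*.yourdomain.com/*" "-*.googlesyndication.com/*" "-*.facebook.net/*" "-*.google-analytics.com/*"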

You can also use Wget:

wget --mirror --convert-links --adjust-extension --page-requisites --no-parent http://www.yourdomain.com

When you are done, you still need to rewrite all the links so they don't point at .../index.html. This approach effectively works as a dynamic-to-static HTML converter.
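
A minimal sketch of that cleanup step, assuming GNU sed and a server that serves directory URLs without index.html:

# Rewrite local links like "about/index.html" back to "about/" in every mirrored HTML file
find www.yourdomain.com -name '*.html' -exec sed -i 's|/index\.html"|/"|g' {} +

Spot-check a few files afterwards; the exact pattern to replace depends on how wget wrote the converted links.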

Tags:

Web