What determines how often the Wayback Machine crawls a website?

There is no single crawl schedule. The Wayback Machine archive combines data from a large number of different crawls, so how often a site is captured depends on which of them include it:

  • Alexa crawls, which appear after a 6-month delay
  • Our own crawls, which are seeded from the Alexa top million list and others
  • ArchiveTeam crawls, done by volunteers
  • Archive-It crawls, done by our 400+ partners, mostly libraries; many of them allow their data to be included in the general Wayback Machine

We have an experimental Wayback Machine search and explore interface at https://web-beta.archive.org/ that shows why each capture was made.
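
If you would rather check from a script how often a site has actually been captured, here is a minimal sketch in Python. It assumes the public Wayback CDX endpoint at https://web.archive.org/cdx/search/cdx and its url, output, fl and limit parameters; the function names and the default limit are made up for illustration. It only counts captures per year. It does not tell you which crawl made each capture.

```python
import json
from collections import Counter
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint: the public Wayback CDX server.
CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def fetch_timestamps(site, limit=1000):
    """Return capture timestamps (YYYYMMDDhhmmss strings) for a site.

    Hypothetical helper: relies on the CDX parameters url, output,
    fl and limit behaving as assumed above.
    """
    query = urlencode(
        {"url": site, "output": "json", "fl": "timestamp", "limit": limit}
    )
    with urlopen(f"{CDX_ENDPOINT}?{query}", timeout=30) as response:
        rows = json.load(response)
    # With output=json the first row is assumed to be a header row; skip it.
    return [row[0] for row in rows[1:]]


def captures_per_year(timestamps):
    """Count captures per calendar year from 14-digit timestamps."""
    counts = Counter(ts[:4] for ts in timestamps)
    return dict(sorted(counts.items()))


if __name__ == "__main__":
    # Needs network access; example.com is just a placeholder site.
    print(captures_per_year(fetch_timestamps("example.com")))
```

A site that is picked up by several of the crawls listed above should show noticeably more captures per year than one that is only reached occasionally.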