How to prevent Googlebot from overwhelming my site?

  • register at Google Webmaster Tools, verify your site, and throttle Googlebot down
  • submit a sitemap
  • read the Google guidelines on efficient crawling (e.g. the If-Modified-Since HTTP header)
  • use robots.txt to restrict the bot's access to some parts of the website
  • make a script that rewrites robots.txt every $[period of time], so that the bot can never crawl too many pages at once but can still reach all the content over a full cycle
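On the If-Modified-Since point: if your server answers a conditional request with 304 Not Modified for unchanged pages, Googlebot re-downloads far less content. A minimal sketch of that check (the `respond` helper and its header handling are illustrative, not from any particular framework):

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime, format_datetime

def respond(request_headers, last_modified):
    """Return (status, response_headers): 304 if the client's cached copy
    is still current, otherwise 200 with a Last-Modified header."""
    ims = request_headers.get("If-Modified-Since")
    if ims:
        try:
            if parsedate_to_datetime(ims) >= last_modified:
                return 304, {}  # unchanged: send no body, save bandwidth
        except (TypeError, ValueError):
            pass  # malformed date header: fall through and serve the page
    return 200, {"Last-Modified": format_datetime(last_modified, usegmt=True)}
```

Most web frameworks and static-file servers can do this for you; the point is simply to make sure it is enabled.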
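The rotating-robots.txt idea from the last bullet could be sketched like this. The section paths and output location are hypothetical; a cron job would call this once per period:

```python
# Hypothetical heavy sections of the site. Only one is crawlable per
# period, so over a full rotation every section still gets crawled.
SECTIONS = ["/archive/", "/gallery/", "/forum/"]

def robots_txt(allowed_index):
    """Build a robots.txt that disallows every heavy section except one."""
    lines = ["User-agent: *"]
    for i, path in enumerate(SECTIONS):
        if i != allowed_index:
            lines.append("Disallow: " + path)
    return "\n".join(lines) + "\n"

# From cron, advance a step counter each period and rewrite the file:
# with open("/var/www/robots.txt", "w") as f:
#     f.write(robots_txt(step % len(SECTIONS)))
```

Note that Googlebot caches robots.txt (typically for up to a day), so rotating it faster than that gains nothing.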

You can set how your site is crawled using Google's Webmaster Tools. Specifically, take a look at this page: Changing Google's crawl rate

You can also restrict which pages Googlebot crawls using a robots.txt file. There is a Crawl-delay directive, but it appears that Google does not honor it.
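For reference, a minimal robots.txt along those lines might look like this (the `/search/` path is just an example of an expensive, low-value section):

```
User-agent: *
Disallow: /search/
# Crawl-delay is honored by some crawlers (e.g. Bing, Yandex)
# but, as noted above, not by Googlebot:
Crawl-delay: 10
```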


Register your site with Google Webmaster Tools, which lets you set how often, and at how many requests per second, Googlebot should try to index your site. Google Webmaster Tools can also help you create a robots.txt file to reduce the load on your site.