Does the user agent in any regular browser contain 'bot' or 'crawl'?

According to the list at http://www.useragentstring.com/pages/useragentstring.php?typ=Browser with over 9000 user agent strings from various browsers:

  • 0 browser user agent strings contain the word "bot"
  • 2 browser user agent strings contain the word "crawl"
  • 0 browser user agent strings contain the word "spider"

(The 2 that contain "crawl" are "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0; YComp 5.0.2.6; MSIECrawler)" and "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0; MSIECrawler)". I think it is safe to ignore those.)

According to the list at http://www.useragentstring.com/pages/useragentstring.php?typ=Crawler with 442 user agent strings listed as bots:

  • 208 bot user agent strings contain the word "bot"
  • 63 bot user agent strings contain the word "crawl"
  • 37 bot user agent strings contain the word "spider"
  • 282 bot user agent strings contain at least one of "bot", "crawl" or "spider"

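My conclusion: it is safe to filter bots by checking the user agent string for the words "bot", "crawl" and "spider". Based on the lists above, this produces almost no false positives among regular browsers and catches 282 of the 442 listed bots (about 64%). It's not bullet-proof, but it is definitely better than nothing.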

Note: I searched for the keywords case-insensitively.
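
For illustration, the keyword check described above could be sketched in Python roughly like this. The function name looks_like_bot is made up for this example, and treating a missing or empty user agent as "not a bot" is my own assumption, not something the lists tell us.

```python
import re

# The three keywords from the counts above, matched case-insensitively.
BOT_PATTERN = re.compile(r"bot|crawl|spider", re.IGNORECASE)


def looks_like_bot(user_agent):
    """Return True if the user agent string contains "bot", "crawl" or "spider".

    Hypothetical helper: a missing or empty user agent is treated as
    "not a bot" here, which is an assumption you may want to change.
    """
    if not user_agent:
        return False
    return BOT_PATTERN.search(user_agent) is not None
```

Note that this would also flag the two MSIECrawler strings mentioned earlier, and it will miss any bot that does not use one of the three words in its user agent.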


A better solution, IMO, would be to check whether the user is logged in. If they are not, show the standard page (which can be cached). A web spider will never be logged in, and if you are optimizing for spiders, why not do the same for new visitors to your site?
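
A minimal sketch of that idea in Python, assuming an in-memory dictionary as the cache and a render callback supplied by your application. The names serve_page, render and _page_cache are hypothetical, and a real site would also need cache expiry, which is left out here.

```python
# Hypothetical in-memory cache of pages rendered for anonymous visitors.
_page_cache = {}


def serve_page(path, user_id, render):
    """Return a personalized page for logged-in users, a cached one otherwise.

    user_id is None for anyone who is not logged in, which covers web
    spiders as well as first-time visitors. render(path, user_id) is an
    assumed callback that builds the page body.
    """
    if user_id is not None:
        # Logged-in users always get a freshly rendered page.
        return render(path, user_id)
    if path not in _page_cache:
        # Render the standard page once and reuse it for all anonymous hits.
        _page_cache[path] = render(path, None)
    return _page_cache[path]
```

With this approach there is no need to inspect the user agent at all: spiders and anonymous visitors share the same cached page.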
