What does "Disallow: /search" mean in robots.txt?

In the Disallow field you specify the beginning of the URL path: any URL whose path starts with that value is blocked.
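
Rules only take effect inside a group headed by a User-agent line, so a minimal robots.txt containing the rule from the question might look like this (User-agent: * applies the group to all crawlers):

    User-agent: *
    Disallow: /search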

So if you have Disallow: /, it blocks everything, as every URL path starts with /.

If you have Disallow: /a, it blocks all URLs whose paths begin with /a. That could be /a.html, /a/b/c/hello, or /about.

In the same sense, if you have Disallow: /search, it blocks all URLs whose paths begin with the string /search. Assuming the robots.txt is at http://example.com/robots.txt, it would block the following URLs, for example:

  • http://example.com/search
  • http://example.com/search.html
  • http://example.com/searchengine
  • http://example.com/search/
  • http://example.com/search/index.html

While the following URLs would still be allowed:

  • http://example.com/foo/search
  • http://example.com/sea

Note that robots.txt doesn't know or care whether the string matches a directory, a file, or nothing at all; it only compares the characters in the URL path.
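
If you want to check this matching behavior yourself, Python's standard-library urllib.robotparser implements the same prefix rule. Here is a minimal sketch, using the example.com URLs from the lists above and a made-up crawler name "MyBot":

    from urllib.robotparser import RobotFileParser

    # Disallow lines only count inside a User-agent group,
    # so the group header is included here.
    rp = RobotFileParser()
    rp.parse([
        "User-agent: *",
        "Disallow: /search",
    ])

    urls = [
        "http://example.com/search",             # blocked
        "http://example.com/search.html",        # blocked
        "http://example.com/searchengine",       # blocked
        "http://example.com/search/",            # blocked
        "http://example.com/search/index.html",  # blocked
        "http://example.com/foo/search",         # allowed: path starts with /foo
        "http://example.com/sea",                # allowed: /sea is not prefixed by /search
    ]

    for url in urls:
        # can_fetch() returns False when the URL's path starts with /search
        verdict = "allowed" if rp.can_fetch("MyBot", url) else "blocked"
        print(f"{verdict}: {url}")

Running this prints "blocked" for the first five URLs and "allowed" for the last two, matching the lists above.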


Other answers explain how robots.txt is processed to apply this rule, but don't address why you would want to disallow bots from crawling your search results.

One reason might be that your search results are expensive to generate. Telling bots not to crawl those pages could reduce load on your servers.

Search result pages also make poor landing pages: they typically contain little more than a list of ten links from your site with titles and descriptions, and users are generally better served by landing directly on the most relevant of those pages. In fact, Google's webmaster guidelines have explicitly asked site owners to prevent crawling of their internal search result pages. If you don't disallow them, Google could penalize your site.