Question 1

What is a robots.txt file?

Accepted Answer

robots.txt is a plain-text file at the root of your site that tells search engine crawlers which URLs they may and may not request. Most crawlers fetch it before requesting anything else on the site.

Question 2

Where do I put the robots.txt file?

Accepted Answer

It must live at the root of the host it applies to, for example https://example.com/robots.txt. Crawlers do not look for it in subfolders, and each subdomain needs its own file.
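As a rough illustration of the "root only" rule, here is a minimal Python sketch that works out which robots.txt URL governs a given page. The function name robots_url and the sample URLs are made up for this example. It assumes the usual convention that the file is looked up per scheme and host (including any port), with the path, query string and fragment discarded.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the robots.txt URL that governs the given page URL.

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(page_url)
    # Keep the scheme and host (with port, if any); drop path, query, fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


if __name__ == "__main__":
    print(robots_url("https://example.com/blog/post?id=7"))
    print(robots_url("https://shop.example.com/cart"))
```

Note that the second URL resolves to a different file from the first: a robots.txt on example.com says nothing about shop.example.com.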

Question 3

Does robots.txt stop a page from being indexed?

Accepted Answer

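No. Disallow blocks crawling, not indexing: a blocked URL can still appear in results if other pages link to it. To prevent indexing, leave the page crawlable and add a noindex meta tag, because crawlers can only see the tag on pages they are allowed to fetch.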

Question 4

How do I block all crawlers with robots.txt?

Accepted Answer

Use "User-agent: *" followed by "Disallow: /" on the next line. This asks every compliant crawler to avoid the whole site, which is handy for staging environments. Remember it is a request, not a hard block: non-compliant crawlers can ignore it, and it does not guarantee that already-indexed URLs are removed.
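To see how a compliant crawler would read those two lines, here is a small sketch using Python's standard-library urllib.robotparser. The helper name is_allowed, the crawler name "ExampleBot" and the staging URLs are assumptions made for the example. It only models a well-behaved crawler; it says nothing about crawlers that ignore the file.

```python
from urllib.robotparser import RobotFileParser

# The two-line "block everything" file described above.
BLOCK_ALL = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if a compliant crawler may request `url` (hypothetical helper)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("Googlebot", "ExampleBot"):
        for url in ("https://staging.example.com/",
                    "https://staging.example.com/any/page.html"):
            print(agent, url, is_allowed(BLOCK_ALL, agent, url))
```

Every combination should come back as not allowed, since the wildcard group applies to any crawler that has no more specific group of its own.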

Question 5

Why does Google say "Indexed, though blocked by robots.txt"?

Accepted Answer

Because Disallow only prevents crawling, not indexing. If other pages link to a blocked URL, Google may index it without a description. To remove it, take out the Disallow rule so Google can crawl the page, then add a noindex meta tag to the page.
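The fix only works when both conditions hold at once: the page is crawlable and it carries noindex. The sketch below checks that combination with Python's standard library. The names RobotsMetaParser, has_noindex and noindex_is_visible, the sample rules and the sample page are all hypothetical. It looks only at the meta tag and is a simplified model, not a description of how Google evaluates pages.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            for directive in attr.get("content", "").split(","):
                self.directives.add(directive.strip().lower())


def has_noindex(html: str) -> bool:
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


def noindex_is_visible(robots_txt, page_url, page_html, user_agent="Googlebot"):
    """True only if the page may be crawled AND it carries a noindex meta tag."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    return rules.can_fetch(user_agent, page_url) and has_noindex(page_html)


# Hypothetical sample data for the two situations described above.
PAGE = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
URL = "https://example.com/private/report.html"
BLOCKED = "User-agent: *\nDisallow: /private/\n"   # crawler never sees the tag
CRAWLABLE = "User-agent: *\nDisallow: /tmp/\n"     # crawler can read the tag

if __name__ == "__main__":
    print(noindex_is_visible(BLOCKED, URL, PAGE))
    print(noindex_is_visible(CRAWLABLE, URL, PAGE))
```

With the blocking rule in place the check fails even though the tag is present, which is the situation behind the "Indexed, though blocked by robots.txt" message.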

Free Robots.txt Generator

What robots.txt does

Common robots.txt examples

"Indexed, though blocked by robots.txt" explained

How to use this tool

Pair it with a sitemap

Frequently asked questions

Related free tools