Free Robots.txt Generator

Create a valid robots.txt in seconds. Choose what crawlers can access, set crawl-delay, declare your sitemap, then copy or download the file.

User-agent: *
Allow: /

What robots.txt does

Robots.txt is the first file most search engine crawlers request when they visit your site. It lives at your domain root and tells crawlers which paths they may or may not access. Used well, it keeps bots out of admin areas, internal search pages, and other low-value URLs so your crawl budget goes to the pages that matter.
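As a rough illustration of how a compliant crawler consults these rules, the sketch below uses Python's standard urllib.robotparser module. The paths /admin/ and /search, the example.com URLs, and the helper name is_allowed are placeholder assumptions for the example, not rules taken from this page.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: keep bots out of an admin area and internal search.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /search
"""

def is_allowed(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if the given rules let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)

print(is_allowed(ROBOTS_TXT, "https://example.com/blog/post"))    # True
print(is_allowed(ROBOTS_TXT, "https://example.com/admin/login"))  # False
print(is_allowed(ROBOTS_TXT, "https://example.com/search?q=x"))   # False
```

A check like this only models well-behaved crawlers; nothing in robots.txt technically prevents a bot from requesting a disallowed path.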

A crucial nuance: disallowing a path stops crawling, not indexing. A blocked URL can still appear in Google if other pages link to it. To keep a page out of the index entirely, leave it crawlable and add a noindex meta tag (which you can build with our meta tag generator) rather than relying on robots.txt, because a crawler can only obey a noindex instruction it is allowed to read.

Common robots.txt examples

Most sites start from one of a few standard patterns:

Allow all crawlers (the default - every page is crawlable):

User-agent: *
Allow: /

Block all crawlers (useful for staging sites):

User-agent: *
Disallow: /

WordPress (block the admin area but keep admin-ajax, and declare the sitemap):

User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php

Sitemap: https://example.com/sitemap.xml

The presets above load these for you instantly so you can tweak rather than type from scratch.
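The WordPress pattern depends on how a crawler resolves an Allow and a Disallow that both match the same URL. Google's documentation and RFC 9309 describe choosing the most specific (longest) matching path, with Allow winning a tie. The sketch below is a simplified illustration of that rule for plain path prefixes only: it ignores the * and $ wildcards, and resolve is a hypothetical helper name, not any library's API.

```python
from typing import List, Tuple

Rule = Tuple[str, str]  # (directive, path), e.g. ("disallow", "/wp-admin/")

def resolve(rules: List[Rule], path: str) -> bool:
    """Return True if path is allowed under longest-match precedence."""
    best_len = -1
    allowed = True  # no matching rule means the path is allowed
    for directive, rule_path in rules:
        # Skip empty rules and rules whose prefix does not match.
        if not rule_path or not path.startswith(rule_path):
            continue
        is_allow = directive.lower() == "allow"
        # A longer match wins; on equal length, Allow wins.
        if len(rule_path) > best_len or (len(rule_path) == best_len and is_allow):
            best_len = len(rule_path)
            allowed = is_allow
    return allowed

wordpress_rules = [
    ("disallow", "/wp-admin/"),
    ("allow", "/wp-admin/admin-ajax.php"),
]

print(resolve(wordpress_rules, "/wp-admin/options.php"))     # False
print(resolve(wordpress_rules, "/wp-admin/admin-ajax.php"))  # True
print(resolve(wordpress_rules, "/blog/hello-world/"))        # True
```

Not every parser follows this precedence. Python's built-in urllib.robotparser, for instance, applies the first matching rule in file order, so it is worth testing important rules against the crawler you actually care about.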

"Indexed, though blocked by robots.txt" explained

This common Google Search Console warning trips a lot of people up. It happens because Disallow only stops Google from crawling a page - it does not stop Google from indexing the URL if other pages link to it. The result is a URL in the index with no description, because Google was never allowed to read it.

If you want a page fully out of the index, do not block it in robots.txt. Instead, allow crawling and add a noindex meta tag (build one with our meta tag generator) so Google can read the instruction and drop the page.
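One way to catch this conflict before it reaches Search Console is to check both conditions together: the page carries noindex, and crawlers are allowed to fetch it. The sketch below does that with Python's standard library. The class and function names, the /private/ path, and the sample HTML are assumptions made for illustration.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser

class RobotsMetaFinder(HTMLParser):
    """Records whether a <meta name="robots"> tag contains noindex."""

    def __init__(self) -> None:
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots" and "noindex" in attr.get("content", "").lower():
            self.noindex = True

def noindex_is_readable(robots_txt: str, url: str, html: str) -> bool:
    """True only if the page carries noindex AND crawlers may fetch it."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return finder.noindex and parser.can_fetch("*", url)

# Hypothetical page and rule sets.
page = '<html><head><meta name="robots" content="noindex"></head></html>'
blocked = "User-agent: *\nDisallow: /private/\n"
crawlable = "User-agent: *\nAllow: /\n"
url = "https://example.com/private/report"

print(noindex_is_readable(blocked, url, page))    # False: the tag is never seen
print(noindex_is_readable(crawlable, url, page))  # True: the tag can be read
```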

How to use this tool

Start from a preset - allow everything, block everything, or a sensible WordPress default - then adjust the user-agent, allow and disallow paths, crawl-delay, and sitemap declaration. Copy or download the finished file and upload it to your site root.
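For readers who prefer to script the same steps, here is a minimal sketch of how such a file could be assembled from those settings. It is not this tool's implementation; build_robots_txt and its parameter names are hypothetical.

```python
from typing import Optional, Sequence

def build_robots_txt(
    user_agent: str = "*",
    allow: Sequence[str] = (),
    disallow: Sequence[str] = (),
    crawl_delay: Optional[int] = None,
    sitemap: Optional[str] = None,
) -> str:
    """Assemble a single-group robots.txt from the given settings."""
    lines = [f"User-agent: {user_agent}"]
    lines += [f"Disallow: {path}" for path in disallow]
    lines += [f"Allow: {path}" for path in allow]
    if crawl_delay is not None:
        lines.append(f"Crawl-delay: {crawl_delay}")
    if sitemap:
        # A blank line separates the rule group from the sitemap declaration.
        lines += ["", f"Sitemap: {sitemap}"]
    return "\n".join(lines) + "\n"

# Reproduces the WordPress example shown earlier on this page.
wordpress = build_robots_txt(
    disallow=["/wp-admin/"],
    allow=["/wp-admin/admin-ajax.php"],
    sitemap="https://example.com/sitemap.xml",
)
print(wordpress)
```

Support for Crawl-delay varies by crawler. Google's documentation states that Googlebot does not use it, so treat that line as a hint for the bots that honor it.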

Pair it with a sitemap

Declaring your sitemap inside robots.txt helps crawlers discover your pages faster. Generate one with our XML sitemap generator, then let Soro keep your content growing on autopilot.
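To confirm that a sitemap declaration is readable by a parser, a quick check along these lines may help. It assumes Python 3.8 or later, where RobotFileParser.site_maps() is available; the sample file contents and the helper name declared_sitemaps are illustrative.

```python
from urllib.robotparser import RobotFileParser

SAMPLE_ROBOTS_TXT = """\
User-agent: *
Allow: /

Sitemap: https://example.com/sitemap.xml
"""

def declared_sitemaps(robots_txt: str) -> list:
    """Return the sitemap URLs declared in a robots.txt body."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap line is present.
    return parser.site_maps() or []

print(declared_sitemaps(SAMPLE_ROBOTS_TXT))  # ['https://example.com/sitemap.xml']
```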

Frequently asked questions

What is a robots.txt file?

Robots.txt is a plain-text file at the root of your site that tells search engine crawlers which URLs they can or cannot request. It is the first file most crawlers check.

Where do I put the robots.txt file?

It must live at the root of your domain, for example https://example.com/robots.txt. Crawlers will not find it in a subfolder.
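Because the location is fixed, the robots.txt URL for any page can be derived from that page's scheme and host alone. The sketch below shows one way to do this with Python's urllib.parse; the function name robots_txt_url and the sample URLs are assumptions for illustration.

```python
from urllib.parse import urlsplit, urlunsplit

def robots_txt_url(page_url: str) -> str:
    """Return the robots.txt URL that governs the given page URL."""
    parts = urlsplit(page_url)
    # Keep scheme and host; replace path, query, and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))

print(robots_txt_url("https://example.com/blog/2024/post?ref=home"))
# https://example.com/robots.txt
print(robots_txt_url("https://shop.example.com/cart"))
# https://shop.example.com/robots.txt
```

As the second call suggests, each host, including each subdomain, is covered by its own robots.txt file.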

Does robots.txt stop a page from being indexed?

No. Disallow blocks crawling, not indexing - a blocked URL can still appear in results if other pages link to it. To prevent indexing, use a noindex meta tag on a page that crawlers are still allowed to fetch.

How do I block all crawlers with robots.txt?

Use "User-agent: *" followed by "Disallow: /". This asks every compliant crawler to avoid the whole site - handy for staging environments. Remember it is a request, not a hard block, and it does not guarantee de-indexing.

Why does Google say "Indexed, though blocked by robots.txt"?

Because Disallow only prevents crawling, not indexing. If other pages link to a blocked URL, Google may index it without a description. To remove it, allow crawling and add a noindex meta tag instead.
