What robots.txt does
Robots.txt is the first file most search engine crawlers request when they visit your site. It lives at your domain root and tells crawlers which paths they may or may not access. Used well, it keeps bots out of admin areas, internal search pages, and other low-value URLs so your crawl budget goes to the pages that matter.
A crucial nuance: disallowing a path stops crawling, not indexing. A blocked URL can still appear in Google if other pages link to it. To keep a page out of the index entirely, use a noindex meta tag (which you can build with our meta tag generator) and leave the page crawlable so Google can see the tag, rather than relying on robots.txt.
Common robots.txt examples
Most sites start from one of a few standard patterns:
Allow all crawlers (the default - every page is crawlable):
User-agent: *
Allow: /
Block all crawlers (useful for staging sites):
User-agent: *
Disallow: /
WordPress (block the admin area but keep admin-ajax, and declare the sitemap):
User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php
Sitemap: https://example.com/sitemap.xml
The presets above load these for you instantly so you can tweak rather than type from scratch.
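To see why the WordPress pattern keeps admin-ajax.php reachable even though /wp-admin/ is disallowed, here is a minimal Python sketch of the "longest matching rule wins" precedence described in RFC 9309. It is a simplification, not a full parser: it handles plain path prefixes only (no * or $ wildcards), and the function name is_allowed and the rules list format are assumptions made for illustration.

```python
# Sketch of longest-match precedence for one user-agent group.
# Assumption: rules are plain path prefixes, with no * or $ wildcards.

def is_allowed(path, rules):
    """rules is a list of (directive, prefix) pairs, e.g. ("Disallow", "/wp-admin/")."""
    best_len = -1
    allowed = True  # a path that matches no rule is crawlable
    for directive, prefix in rules:
        if not prefix or not path.startswith(prefix):
            continue  # skip empty values and prefixes that do not match
        is_allow = directive.lower() == "allow"
        # The longer prefix wins; on a tie, Allow beats Disallow.
        if len(prefix) > best_len or (len(prefix) == best_len and is_allow):
            best_len = len(prefix)
            allowed = is_allow
    return allowed


wordpress_rules = [
    ("Disallow", "/wp-admin/"),
    ("Allow", "/wp-admin/admin-ajax.php"),
]

print(is_allowed("/wp-admin/admin-ajax.php", wordpress_rules))  # True
print(is_allowed("/wp-admin/options.php", wordpress_rules))     # False
print(is_allowed("/blog/hello-world/", wordpress_rules))        # True
```

The Allow line wins for admin-ajax.php because its prefix is longer than /wp-admin/, so the order of the two lines in the file does not matter under this rule.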
"Indexed, though blocked by robots.txt" explained
This common Google Search Console warning trips a lot of people up. It happens because Disallow only stops Google from crawling a page - it does not stop Google from indexing the URL if other pages link to it. The result is a URL in the index with no description, because Google was never allowed to read it.
If you want a page fully out of the index, do not block it in robots.txt. Instead, allow crawling and add a noindex meta tag (build one with our meta tag generator) so Google can read the instruction and drop the page.
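If you want to confirm that a page really carries the noindex instruction, a small check like the following Python sketch can help. It uses only the standard library's html.parser; the class name RobotsMetaParser and the helper has_noindex are hypothetical names chosen for this example, and the sketch looks only at meta robots tags, not at HTTP headers.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. a bare attribute), so normalise them.
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() == "robots":
            for part in attr_map.get("content", "").split(","):
                if part.strip():
                    self.directives.add(part.strip().lower())


def has_noindex(html):
    """Return True if the HTML contains a meta robots tag with noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


page = '<html><head><meta name="robots" content="noindex, follow"></head><body></body></html>'
print(has_noindex(page))                                           # True
print(has_noindex("<html><head><title>Hi</title></head></html>"))  # False
```

Remember that a crawler can only run the equivalent of this check if robots.txt lets it fetch the page in the first place.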
How to use this tool
Start from a preset - allow everything, block everything, or a sensible WordPress default - then adjust the user-agent, allow and disallow paths, crawl-delay (Googlebot ignores this directive, though some other crawlers honor it), and sitemap declaration. Copy or download the finished file and upload it to your site root.
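If you would rather script the same steps, the options above map onto a few lines of Python. This is a rough sketch under our own assumptions, not how the tool itself is implemented: the function build_robots_txt and its parameter names are hypothetical, and it emits a single user-agent group.

```python
def build_robots_txt(user_agent="*", allow=(), disallow=(), crawl_delay=None, sitemap=None):
    """Assemble a single-group robots.txt file as a string."""
    lines = [f"User-agent: {user_agent}"]
    lines += [f"Disallow: {path}" for path in disallow]
    lines += [f"Allow: {path}" for path in allow]
    if not allow and not disallow:
        lines.append("Allow: /")  # no rules given: fall back to the allow-all preset
    if crawl_delay is not None:
        lines.append(f"Crawl-delay: {crawl_delay}")
    if sitemap:
        lines.append(f"Sitemap: {sitemap}")
    return "\n".join(lines) + "\n"


# Reproduce the WordPress preset.
print(build_robots_txt(
    disallow=["/wp-admin/"],
    allow=["/wp-admin/admin-ajax.php"],
    sitemap="https://example.com/sitemap.xml",
))
```

Write the returned string to a file named robots.txt and upload it to the site root, exactly as you would with the downloaded file.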
Pair it with a sitemap
Declaring your sitemap inside robots.txt helps crawlers discover your pages faster. Generate one with our XML sitemap generator, then let Soro keep your content growing on autopilot.
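To check that a sitemap declaration is actually readable from a robots.txt file, Python's standard urllib.robotparser can list the Sitemap lines it finds. The sketch below parses an inline copy of the WordPress example rather than fetching a live site; site_maps() needs Python 3.8 or newer.

```python
from urllib.robotparser import RobotFileParser

# Inline sample so the example runs without network access.
robots_txt = """\
User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php
Sitemap: https://example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# site_maps() returns a list of declared sitemap URLs, or None if there are none.
sitemaps = parser.site_maps()
print(sitemaps)  # ['https://example.com/sitemap.xml']
```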