SEO BASICS

What is robots.txt?

robots.txt is a plain-text file that lives at the root of your website (e.g. https://yoursite.com/robots.txt) and tells search engine crawlers — Googlebot, Bingbot, and friends — which parts of your site they're allowed to fetch.

Crawl control. Rules are grouped by crawler: each group opens with a User-agent: line naming a specific bot (or all bots via User-agent: *), followed by Allow: and Disallow: lines for URL paths. Crawlers that honor the file request it before fetching anything else on the host, so your rules apply before a single page is crawled.
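As a rough illustration of how groups are matched, here is a minimal Python sketch using the standard library's urllib.robotparser. The file contents, the paths, and the is_allowed helper are made up for this example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one group for every crawler, one for Bingbot only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/

User-agent: Bingbot
Disallow: /search/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_allowed(agent, path):
    """Ask the parsed rules whether `agent` may fetch `path` (illustrative helper)."""
    return parser.can_fetch(agent, "https://yoursite.com" + path)


# Googlebot has no group of its own, so it falls back to the * group.
print(is_allowed("Googlebot", "/admin/users"))
print(is_allowed("Googlebot", "/search/"))

# Bingbot matches its own group, so the * group no longer applies to it.
print(is_allowed("Bingbot", "/search/"))
print(is_allowed("Bingbot", "/admin/users"))
```

A crawler follows only the most specific group that names it. If Bingbot should also stay out of /admin/, that path has to be repeated inside the Bingbot group.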

Blocking pages. Most sites use robots.txt to keep crawlers out of admin dashboards, internal search results, staging URLs, shopping-cart endpoints, and other pages that crawlers have no reason to fetch. This saves crawl budget and keeps bots from spending their time on thin-content pages.

SEO importance. A well-configured robots.txt focuses crawlers on the pages that matter, points them to your sitemap.xml for faster discovery, and keeps duplicate or low-value URLs from eating into your crawl budget. A misconfigured one can accidentally block your entire site, so test before you publish.
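To show what such a file can look like, here is a small Python sketch that assembles one rule group plus a Sitemap line. The build_robots_txt function, its parameters, and the example paths are assumptions made for illustration; this is not the generator's actual code.

```python
def build_robots_txt(disallow_paths, sitemap_url=None, agent="*"):
    """Return robots.txt text with one rule group and an optional Sitemap line."""
    lines = [f"User-agent: {agent}"]
    # An empty Disallow value means "nothing is blocked" for this group.
    lines += [f"Disallow: {path}" for path in disallow_paths] or ["Disallow:"]
    if sitemap_url:
        # Sitemap lines stand outside the groups, so separate them with a blank line.
        lines += ["", f"Sitemap: {sitemap_url}"]
    return "\n".join(lines) + "\n"


print(build_robots_txt(["/admin/", "/cart/"], "https://yoursite.com/sitemap.xml"))
```

Passing an empty list produces a group with a bare Disallow: line, which leaves the whole site crawlable.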

FAQ

Frequently Asked Questions

Everything you need to know about robots.txt, SEO, and getting the most out of this generator.

01 What does robots.txt actually do?

robots.txt is a plain-text file at the root of your site that tells crawlers which URLs they can request. Rules come in groups: each group starts with one or more User-agent: lines (which bots it targets) followed by Allow: or Disallow: directives. Well-behaved crawlers fetch this file before anything else and respect what they find.

02 Where do I upload the robots.txt file?

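The file must live at the root of your domain, served from https://yoursite.com/robots.txt. Crawlers do not look in subfolders, and each subdomain needs its own file. For WordPress, drop it in the site root via FTP or your hosting file manager. For Next.js, put it in public/robots.txt. For Shopify, customize the robots.txt.liquid theme template. Squarespace generates the file automatically, so check its settings for the crawler options it exposes rather than uploading your own.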

03 What's the difference between Disallow and noindex?

Disallow in robots.txt blocks crawling — Google won't fetch the page. But if the page is linked from elsewhere, it can still show up in search results (without a description). To fully remove a page from the index, use the <meta name="robots" content="noindex"> tag and let Google crawl the page so it can see that tag. Don't combine Disallow + noindex — Google won't be able to read the noindex directive.
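The conflict between Disallow and noindex can be checked mechanically. The sketch below is a minimal Python illustration using only the standard library. The class RobotsMetaFinder, the function noindex_is_readable, and the sample page are hypothetical names and data invented for this example.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class RobotsMetaFinder(HTMLParser):
    """Collects the content value of every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "meta" and (attr.get("name") or "").lower() == "robots":
            self.directives.append((attr.get("content") or "").lower())


def noindex_is_readable(robots_txt, page_url, page_html, agent="Googlebot"):
    """True only if the page carries noindex AND the crawler may fetch it."""
    finder = RobotsMetaFinder()
    finder.feed(page_html)
    has_noindex = any("noindex" in d for d in finder.directives)

    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return has_noindex and parser.can_fetch(agent, page_url)


PAGE = '<html><head><meta name="robots" content="noindex"></head></html>'
URL = "https://yoursite.com/private/page"

# Crawling allowed: the noindex tag can be seen and honored.
print(noindex_is_readable("User-agent: *\nDisallow:\n", URL, PAGE))
# Crawling blocked: the tag is never fetched, so it has no effect.
print(noindex_is_readable("User-agent: *\nDisallow: /private/\n", URL, PAGE))
```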

04 Can robots.txt block AI training bots?

Yes, for crawlers that honor the file. Enable the "Block AI training bots" toggle in the generator and we'll add disallow rules for GPTBot, ClaudeBot, Google-Extended, CCBot, and PerplexityBot. These are the user-agent tokens published by OpenAI, Anthropic, Google (for its AI training use), Common Crawl, and Perplexity, and the rules ask them not to fetch your site. Note that robots.txt is a request, not an enforcement mechanism: less-scrupulous scrapers ignore it entirely, and server-side blocking is needed for those.
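The generator's exact output is not reproduced here. As a sketch of what such rules could look like, the Python below builds one group that lists the five tokens named above and then checks it with the standard library parser. AI_BOTS and ai_block_group are illustrative names.

```python
from urllib.robotparser import RobotFileParser

# User-agent tokens taken from the answer above.
AI_BOTS = ["GPTBot", "ClaudeBot", "Google-Extended", "CCBot", "PerplexityBot"]


def ai_block_group(bots=AI_BOTS):
    """Return one robots.txt group that disallows everything for the given bots."""
    # Several User-agent lines in a row share the rules that follow them.
    lines = [f"User-agent: {bot}" for bot in bots]
    lines.append("Disallow: /")
    return "\n".join(lines) + "\n"


rules = ai_block_group()
print(rules)

checker = RobotFileParser()
checker.parse(rules.splitlines())
print(checker.can_fetch("GPTBot", "https://yoursite.com/"))     # blocked
print(checker.can_fetch("Googlebot", "https://yoursite.com/"))  # unaffected
```

Because the group names only the AI tokens, regular search crawlers such as Googlebot and Bingbot are not affected by it.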

05 Do I need a Sitemap line?

Not strictly required, but highly recommended. A Sitemap: directive in robots.txt tells crawlers exactly where to find your sitemap.xml, which speeds up discovery of new and updated pages. Use the full absolute URL: Sitemap: https://yoursite.com/sitemap.xml. You can include multiple sitemap lines if you have several.
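For a quick look at how multiple Sitemap lines are read, here is a short Python sketch. The second sitemap URL is an invented example, and site_maps() requires Python 3.8 or newer.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical file with two Sitemap lines; the blog sitemap URL is made up.
SITEMAP_EXAMPLE = """\
User-agent: *
Disallow: /cart/

Sitemap: https://yoursite.com/sitemap.xml
Sitemap: https://yoursite.com/blog/sitemap.xml
"""

sitemap_parser = RobotFileParser()
sitemap_parser.parse(SITEMAP_EXAMPLE.splitlines())

# site_maps() returns the listed URLs in file order, or None if there are none.
sitemaps = sitemap_parser.site_maps()
print(sitemaps)
```

Sitemap lines are not tied to any User-agent group, so they can sit anywhere in the file. Placing them at the end keeps the rule groups easy to scan.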

06 How do I test my robots.txt?

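Use the robots.txt report in Google Search Console, which replaced the older robots.txt Tester. It shows the robots.txt files Google found for your site, when they were last fetched, and any parsing errors or warnings. To check whether a specific URL is blocked for Googlebot, run it through the URL Inspection tool. Always test before publishing: a stray Disallow: / blocks crawling of your entire site, and your pages can then drop out of search results. Bing Webmaster Tools also offers a robots.txt tester for Bingbot.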
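Before uploading, a draft can also get a rough local check. The Python sketch below lists which must-stay-crawlable URLs a draft would block. The function blocked_urls and the sample URLs are assumptions for illustration. Python's built-in parser uses simple prefix matching and does not implement the * and $ wildcards, so treat it as a sanity check and not as a substitute for the search engines' own tools.

```python
from urllib.robotparser import RobotFileParser


def blocked_urls(robots_txt, must_stay_crawlable, agent="Googlebot"):
    """Return the URLs from must_stay_crawlable that the draft would block."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in must_stay_crawlable if not parser.can_fetch(agent, url)]


IMPORTANT = ["https://yoursite.com/", "https://yoursite.com/pricing"]

# A stray "Disallow: /" blocks every important URL.
print(blocked_urls("User-agent: *\nDisallow: /\n", IMPORTANT))
# A narrower rule leaves them alone, so the list comes back empty.
print(blocked_urls("User-agent: *\nDisallow: /admin/\n", IMPORTANT))
```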
