Free Tool · No Signup · 18 AI Bot Controls

Free Robots.txt Generator — Beginner Friendly, Pro Powerful

Build a perfect robots.txt file in seconds. Tick what to allow or block — including 18 AI crawlers like GPTBot, ClaudeBot and PerplexityBot. Includes platform install guides for WordPress, Shopify, Webflow, Astro, Next.js and more.

Businesses helped: 1,500+ · Cost: 100% free · Signup required: None

Step 1 — Pick a starting point

Quick Start Presets

Each preset applies a sensible set of rules to every section below. You can fine-tune anything afterwards — picking a preset is a starting point, not a commitment.

How to read this builder

  • Green dot — we recommend a specific action for most sites (block or allow). Hover the chip to see which.
  • Amber dot — optional. Depends on your situation. No strong default either way.
  • "?" button on each chip — opens a panel with the full reasons to block vs allow that specific bot or path.

Standard Search Engines

Default: All allowed

These are the search engines that drive your traffic. Leave them all ticked unless you have a specific reason to block one.

Recommended approach: Allow all 8 by default. Only block a search engine if you serve a specific market and have zero traffic from elsewhere (e.g. UK-only B2B sites can safely block Baiduspider).

AI Crawlers

Tick to block

Tick a bot to block it from training on or scraping your content. Hover any chip for what that specific bot does. Use the master toggle below for maximum protection in one click.

Recommended approach (the middle path): Block training crawlers (GPTBot, ClaudeBot, Google-Extended, CCBot, Bytespider) to protect IP, but allow live-search crawlers (ChatGPT-User, Perplexity-User, OAI-SearchBot) so you stay visible in AI answers. Tap any "?" to see the pros / cons for that specific bot.
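To make the middle path concrete, here is a minimal sketch that writes those rules as a robots.txt string and checks them with Python's standard-library parser. The example.com URL and the variable names are placeholders, and the standard-library parser is only an approximation of how each real crawler reads the file.

```python
from urllib.robotparser import RobotFileParser

# Illustrative file: training crawlers blocked, live-search crawlers allowed.
ROBOTS_TXT = """\
# Training crawlers: blocked
User-agent: GPTBot
User-agent: ClaudeBot
User-agent: Google-Extended
User-agent: CCBot
User-agent: Bytespider
Disallow: /

# Live-search crawlers: allowed
User-agent: ChatGPT-User
User-agent: Perplexity-User
User-agent: OAI-SearchBot
Allow: /

# Everyone else: allowed
User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Placeholder page URL used only for the check.
PAGE = "https://example.com/blog/post"
results = {
    bot: parser.can_fetch(bot, PAGE)
    for bot in ("GPTBot", "ClaudeBot", "ChatGPT-User", "OAI-SearchBot", "Googlebot")
}
for bot, allowed in results.items():
    print(f"{bot}: {'allowed' if allowed else 'blocked'}")
```

Grouping several User-agent lines above one rule keeps the file short, and the final catch-all group leaves ordinary search engines untouched.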

Common Paths to Block

Tick the URL patterns you want to block from crawling. The preset above has already ticked the most common ones for your chosen starting point.

CRITICAL: Do NOT block /wp-content/plugins/ or /wp-includes/ on WordPress sites — many themes load CSS / JS from there, and blocking breaks Google's ability to render your pages. Tap each chip's "?" for the full reasoning.
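If you already have a robots.txt and want to check it against this warning, the sketch below flags Disallow lines that cover the two WordPress asset folders. The function name and sample file are hypothetical, and the check assumes plain path prefixes (it does not interpret * or $ wildcards).

```python
# Folders that themes and plugins commonly load CSS / JS from.
RISKY_PREFIXES = ("/wp-content/plugins/", "/wp-includes/")

def risky_disallows(robots_txt):
    """Return Disallow paths that overlap the WordPress asset folders."""
    flagged = []
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()      # drop comments
        field, _, value = line.partition(":")
        if field.strip().lower() != "disallow":
            continue
        path = value.strip()
        if not path:                              # empty Disallow blocks nothing
            continue
        # Flag rules that cover a risky folder or sit inside one.
        if any(p.startswith(path) or path.startswith(p) for p in RISKY_PREFIXES):
            flagged.append(path)
    return flagged

SAMPLE = """\
User-agent: *
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /wp-content/plugins/
"""

print(risky_disallows(SAMPLE))
```

For the sample above, only the /wp-admin/ rule passes the check; the other two lines are the ones to remove.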

Admin & CMS

Commerce

Content archives

Query parameters

File types

System

Crawl Delay (advanced — optional)

Googlebot ignores Crawl-delay, while some other crawlers, such as Bingbot, honour it. Only set this if your server is genuinely struggling under crawl load.
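As a rough illustration, a crawl delay is set per bot group. The sketch below uses Python's standard-library parser to read the value back. The 10-second figure and the choice of Bingbot are assumptions for the example, not recommendations.

```python
from urllib.robotparser import RobotFileParser

# Illustrative file: ask one crawler to wait 10 seconds between requests.
CRAWL_DELAY_TXT = """\
User-agent: Bingbot
Crawl-delay: 10

User-agent: *
Disallow:
"""

delay_parser = RobotFileParser()
delay_parser.parse(CRAWL_DELAY_TXT.splitlines())

bing_delay = delay_parser.crawl_delay("Bingbot")      # value from the Bingbot group
other_delay = delay_parser.crawl_delay("Googlebot")   # no delay set for other bots
print(bing_delay, other_delay)
```

The empty Disallow line in the catch-all group means "block nothing", so the delay is the only restriction in this file.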

Sitemap URLs

Tell crawlers where to find your sitemap. We strongly recommend adding at least one: it helps Google and other search engines discover the important pages on your site.

The basics

What is robots.txt?

What it is

robots.txt is a plain text file that lives at the root of your website (yourdomain.com/robots.txt). It tells search-engine crawlers and other bots which parts of your site they can and cannot access. Created in 1994, it is the oldest and most widely respected protocol for controlling crawler behaviour, and every major search engine plus most AI companies honour it.

What it does (and doesn’t do)

  • Tells well-behaved bots which URLs to skip
  • Conserves your crawl budget
  • Reduces server load
  • Keeps well-behaved crawlers out of admin areas
  • Controls AI training data access
  • Does NOT prevent indexing (use a noindex meta tag for that)
  • Does NOT block malicious bots (they ignore it)
  • Does NOT secure private content (use authentication)
  • Does NOT remove already-indexed pages from Google

Common mistakes

  • Blocking CSS or JS files (hurts rankings — Google needs to render your page)
  • Using robots.txt to hide private content (use authentication instead)
  • Typos in user-agent names (a misspelled name matches no bot, so its rules are silently ignored)
  • Missing the file entirely (having one is recommended, even if it is empty)
  • Conflicting Allow / Disallow rules
  • Forgetting to update after a site migration
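On conflicting Allow / Disallow rules: Google documents that the most specific (longest) matching rule wins, and that Allow wins a tie. The sketch below models that for plain path prefixes only. The function name and rule list are hypothetical, and other crawlers may resolve conflicts differently.

```python
def is_allowed(rules, path):
    """Resolve Allow / Disallow conflicts: longest matching prefix wins, Allow wins ties.

    rules is a list of (directive, prefix) pairs, e.g. ("disallow", "/wp-admin/").
    Wildcards are not handled in this sketch.
    """
    best = ("allow", "")  # no matching rule means the path is allowed
    for directive, prefix in rules:
        if not path.startswith(prefix):
            continue
        longer = len(prefix) > len(best[1])
        tie_allow = len(prefix) == len(best[1]) and directive == "allow"
        if longer or tie_allow:
            best = (directive, prefix)
    return best[0] == "allow"

RULES = [
    ("disallow", "/wp-admin/"),
    ("allow", "/wp-admin/admin-ajax.php"),
]

for path in ("/wp-admin/admin-ajax.php", "/wp-admin/options.php", "/blog/"):
    print(path, is_allowed(RULES, path))
```

Here the longer Allow rule carves one file out of the blocked folder, which is why the pair does not actually conflict under this model.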

Step 3 — Install it

Platform Deployment Guides

Exact step-by-step for every major platform. Pick yours, follow the steps, verify at yourdomain.com/robots.txt.

WordPress

WordPress sites have three good options depending on which SEO plugin you run. Pick one method — do not combine them.

Method 1 — Yoast SEO plugin

  1. WordPress Admin → Yoast SEO → Tools
  2. Click File editor
  3. If no robots.txt exists, click Create robots.txt file
  4. Paste the generated content from the preview panel above
  5. Click Save changes to robots.txt and verify at yourdomain.com/robots.txt

Method 2 — Rank Math plugin

  1. WordPress Admin → Rank Math → General Settings
  2. Click Edit robots.txt in the sidebar
  3. Paste the generated content
  4. Click Save Changes

Method 3 — Manual via FTP / cPanel

  1. Click Download robots.txt in the preview panel above
  2. Connect via FTP or cPanel File Manager
  3. Upload to the WordPress root folder (same level as wp-config.php)
  4. Verify at yourdomain.com/robots.txt

WordPress generates a virtual robots.txt by default. As soon as you upload a real file, the virtual one is overridden; there is no setting to toggle.
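If you would rather script the verify step than check it by eye, the sketch below compares the file you generated with what the site actually serves. The helper names are hypothetical, yourdomain.com is a placeholder, and the comparison ignores blank lines and trailing whitespace because some hosts normalise them.

```python
from urllib.request import urlopen

def normalise(text):
    """Keep only non-blank lines, without trailing whitespace or line-ending differences."""
    return [line.rstrip() for line in text.splitlines() if line.strip()]

def fetch_live(url, timeout=10):
    """Download the robots.txt currently served at the given URL."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")

def is_deployed(local_text, live_text):
    """True when the served file carries the same lines as the generated one."""
    return normalise(local_text) == normalise(live_text)

# Usage (needs network access; the domain and local filename are placeholders):
# with open("robots.txt", encoding="utf-8") as handle:
#     local = handle.read()
# print(is_deployed(local, fetch_live("https://yourdomain.com/robots.txt")))
```

A mismatch after uploading usually means a plugin or cache is still serving an older version of the file.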

Frequently Asked Questions

The questions we hear most from clients setting up robots.txt for the first time.

Where does robots.txt go on my server?

At the root of your domain, accessible at yourdomain.com/robots.txt. The file must be named exactly robots.txt, in lowercase. Google will not look for it anywhere else.
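Put another way, the robots.txt location depends only on the scheme and host of a URL, never on the page path. A small sketch of that rule (the function name is hypothetical and yourdomain.com is a placeholder):

```python
from urllib.parse import urlsplit, urlunsplit

def robots_txt_url(page_url):
    """Return the robots.txt URL that governs the given page URL."""
    parts = urlsplit(page_url)
    # Keep scheme and host; replace path, query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))

print(robots_txt_url("https://yourdomain.com/blog/post?page=2"))
print(robots_txt_url("https://shop.yourdomain.com/cart"))
```

The second call shows that a subdomain is a separate host with its own file, so rules on yourdomain.com do not carry over to shop.yourdomain.com.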

Does robots.txt actually block bots?

Only bots that choose to honour it. All major search engines (Google, Bing, DuckDuckGo) and most reputable AI companies (OpenAI, Anthropic, Perplexity) respect it. Malicious scrapers, spam bots and most copy-scrapers ignore it entirely — for those, use authentication, rate limiting or a WAF rule.

Is robots.txt case-sensitive?

The filename must be lowercase: robots.txt. URL paths inside the file ARE case-sensitive (/Admin/ and /admin/ are treated as different paths). User-agent names are NOT case-sensitive, but it is conventional to write them in their canonical case (Googlebot, GPTBot, ClaudeBot) for readability.
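A quick way to see both rules at once is to parse a small file with Python's standard-library parser, which follows the same conventions here: the user-agent name is matched without regard to case, while the path is matched exactly. The file content and example.com URLs are illustrative.

```python
from urllib.robotparser import RobotFileParser

# Lowercase bot name on purpose, and a capitalised path.
CASE_TXT = """\
User-agent: gptbot
Disallow: /Admin/
"""

case_parser = RobotFileParser()
case_parser.parse(CASE_TXT.splitlines())

# "GPTBot" still matches the lowercase "gptbot" group.
upper_path = case_parser.can_fetch("GPTBot", "https://example.com/Admin/settings")
# The rule does not cover the lowercase path.
lower_path = case_parser.can_fetch("GPTBot", "https://example.com/admin/settings")
print(upper_path, lower_path)
```

If your site serves both /Admin/ and /admin/, each spelling needs its own Disallow line.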

How long until Google sees my robots.txt changes?

Google generally caches robots.txt for up to 24 hours, so changes are usually picked up within a day. To speed it up, open the robots.txt report in Google Search Console and request a recrawl.

What is the difference between robots.txt and a noindex meta tag?

robots.txt blocks crawling (the bot never fetches the page). A noindex meta tag blocks indexing (the bot fetches the page but is told not to add it to search results). To remove a page from Google, use noindex and leave the page crawlable. robots.txt alone will NOT remove an already-indexed page, and if the page is also blocked in robots.txt, Google cannot fetch it to see the noindex.

What is the difference between robots.txt and .htaccess?

robots.txt is a polite request to bots — it relies on the bot choosing to obey. .htaccess (or your Nginx config) is server-level enforcement — it can actually block requests, redirect them, or require authentication. Use robots.txt for crawl-budget control and use .htaccess / Nginx rules for genuine access control.

Why should I block AI crawlers?

To prevent your content being used to train AI models without compensation, to protect proprietary content, or to control how your brand is represented in AI-generated answers. The middle path many businesses now take: block training bots (GPTBot, ClaudeBot, Google-Extended, CCBot) but allow live-search bots (ChatGPT-User, Perplexity-User, OAI-SearchBot) so you still appear in AI answers.

Should I block AI crawlers or allow them?

Allow them if you want visibility in AI-generated answers, a fast-growing traffic source. Block them if you want to protect proprietary content from being absorbed into training datasets. A middle path: block training crawlers (GPTBot, ClaudeBot, Google-Extended, CCBot) but allow live-search crawlers (ChatGPT-User, Perplexity-User, OAI-SearchBot). This tool makes that split a one-click toggle.

Can I have multiple sitemap URLs?

Yes. Add multiple Sitemap: lines, one per line. Useful for large sites with separate sitemaps for pages, posts, products, images, video, news, etc. Crawlers that support the Sitemap directive read every file listed.
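For illustration, here is what several Sitemap: lines look like, read back with Python's standard-library parser (its site_maps() method needs Python 3.8 or later). The sitemap filenames are made up for the example.

```python
from urllib.robotparser import RobotFileParser

# Illustrative file with one Sitemap line per sitemap.
SITEMAP_TXT = """\
User-agent: *
Disallow: /wp-admin/

Sitemap: https://yourdomain.com/sitemap-pages.xml
Sitemap: https://yourdomain.com/sitemap-posts.xml
Sitemap: https://yourdomain.com/sitemap-products.xml
"""

sitemap_parser = RobotFileParser()
sitemap_parser.parse(SITEMAP_TXT.splitlines())

sitemaps = sitemap_parser.site_maps()  # list of URLs, in file order
for url in sitemaps:
    print(url)
```

Sitemap lines are independent of the User-agent groups, so they can sit anywhere in the file; putting them at the end is a common convention.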

What does the asterisk (*) mean in robots.txt?

* is a wildcard. User-agent: * means "all bots". In paths, * matches any sequence of characters: Disallow: /*?sort= blocks any URL containing ?sort=. The $ symbol at the end of a path means "URL ends here": /*.pdf$ matches only URLs that finish with .pdf, not URLs containing .pdf in the middle.
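The two symbols can be modelled by translating a robots.txt path pattern into a regular expression. This is a minimal sketch of the matching described above, with hypothetical function names. It is not the code any particular crawler runs.

```python
import re

def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regular expression."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal text; each * becomes "any sequence of characters".
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))

def matches(pattern, url_path):
    """True when the pattern matches from the start of the URL path."""
    return pattern_to_regex(pattern).match(url_path) is not None

print(matches("/*?sort=", "/shoes?sort=price"))
print(matches("/*.pdf$", "/files/guide.pdf"))
print(matches("/*.pdf$", "/files/guide.pdf?download=1"))
print(matches("/*.pdf", "/files/guide.pdf?download=1"))
```

The last two calls differ only in the trailing $, which is what decides whether a URL that continues past .pdf is still matched.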
