Free Robots.txt Generator — Block AI Bots & Set Directives

Create a valid robots.txt file in seconds. Select a CMS preset, add custom allow/disallow rules, instantly block AI scrapers with one click, and verify your paths with the live URL tester. Free, no login required.

Block AI Scrapers

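Automatically add Disallow rules for GPTBot, ClaudeBot, CCBot, Google-Extended and other AI user agents, asking them not to collect your content for model training. Compliance with robots.txt is voluntary on the crawler's side.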

Generated robots.txt

# Generated by PursTech Robots.txt Generator
# https://www.purstech.com/tools/robots-txt-generator

User-agent: *
Disallow: /admin/

Live URL Tester

Verify if a specific URL or path is allowed for a bot based on your generated rules above.
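A check like this can also be reproduced offline. The sketch below is a minimal approximation using Python's standard `urllib.robotparser`; it is not the tester's actual implementation, and the helper name `is_allowed` is hypothetical. Note that the standard-library parser uses simple prefix matching and does not support the `*` and `$` path wildcards that some search engines accept.

```python
from urllib.robotparser import RobotFileParser

# Rules copied from the sample output above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
"""


def is_allowed(robots_txt: str, user_agent: str, url_or_path: str) -> bool:
    """Return True if user_agent may fetch url_or_path under robots_txt.

    Hypothetical helper: wraps the standard-library parser so rules can be
    tested from a string instead of being fetched over the network.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url_or_path)


print(is_allowed(ROBOTS_TXT, "Googlebot", "/admin/settings"))  # False
print(is_allowed(ROBOTS_TXT, "Googlebot", "/blog/post"))       # True
```

Both bare paths and full URLs work as input, because the parser only compares the path portion against the rules.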

Common Bots Guide

* (all crawlers): wildcard that applies to every crawler without its own User-agent group

Googlebot (Google): Google's main search crawler

Bingbot (Microsoft): Microsoft Bing search crawler

Slurp (Yahoo): Yahoo search crawler

DuckDuckBot (DuckDuckGo): DuckDuckGo search crawler

GPTBot (OpenAI): collects content that may be used to train the models behind ChatGPT

ClaudeBot (Anthropic): collects content used to train Claude

CCBot (Common Crawl): builds the Common Crawl dataset, which is used as AI training data

Google-Extended (Google): controls whether your content is used to train Gemini models; it is a control token honored by Google's existing crawlers, not a separate crawler

anthropic-ai (Anthropic): older Anthropic user agent for Claude

Cohere-ai (Cohere): collects content used to train Cohere's enterprise AI models

Omgilibot (Omgili): scrapes content for AI and data analysis

FacebookBot (Meta): crawls for link previews and AI

Diffbot (Diffbot): extracts structured data from web pages

How to Create a Robots.txt File

Choose a preset

Select WordPress, Shopify or Next.js to instantly populate the recommended allow and disallow paths for your platform.

Add custom rules

Add the specific paths you want to block (Disallow) or permit (Allow). Always start paths with a forward slash (/). Select which bot each rule applies to.

Block AI Scrapers

Toggle the Block AI switch to instantly append rules that ask GPTBot, ClaudeBot, CCBot and other AI crawlers not to collect your website's content for training.

Test & Download

Use the Live Tester to ensure your private paths are actually blocked. Once verified, click Download and place the file in your website's root directory.
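The steps above can be sketched in code. The following is a minimal illustration of how rules, an AI-bot block, and a sitemap line combine into one file; it is not the generator's actual implementation. The function `build_robots_txt`, its parameters, and the short `AI_BOTS` list are assumptions made for this example.

```python
# Hypothetical, shortened list; the bots named on this page's one-click toggle.
AI_BOTS = ["GPTBot", "ClaudeBot", "CCBot", "Google-Extended"]


def build_robots_txt(rules, block_ai=False, sitemap_url=None):
    """Assemble robots.txt text.

    rules: dict mapping a user agent to a list of (directive, path) pairs,
           e.g. {"*": [("Disallow", "/admin/")]}.
    """
    lines = []
    for agent, agent_rules in rules.items():
        lines.append(f"User-agent: {agent}")
        for directive, path in agent_rules:
            # Paths must start with a forward slash, as noted in step 2.
            if not path.startswith("/"):
                raise ValueError(f"path must start with '/': {path!r}")
            lines.append(f"{directive}: {path}")
        lines.append("")  # blank line ends the group
    if block_ai:
        for bot in AI_BOTS:
            lines += [f"User-agent: {bot}", "Disallow: /", ""]
    if sitemap_url:
        lines += [f"Sitemap: {sitemap_url}", ""]
    return "\n".join(lines)


print(build_robots_txt(
    {"*": [("Disallow", "/admin/")]},
    block_ai=True,
    sitemap_url="https://www.example.com/sitemap.xml",
))
```

Saving the returned text as `robots.txt` in the site's root directory corresponds to the Download step.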

❓ Frequently Asked Questions

What is a robots.txt file?

A robots.txt file tells search engine crawlers which URLs they can access on your site. It is used mainly to avoid overloading your site with requests. It is not a mechanism for keeping a web page out of Google, because a blocked URL can still be indexed if other pages link to it. To keep a web page out of Google, block indexing with noindex or password-protect the page.

Where should I put my robots.txt file?

The robots.txt file must be located at the root of the website host to which it applies. For example, to control crawling on all URLs below https://www.example.com/, the robots.txt file must be located at https://www.example.com/robots.txt.
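As a small illustration of this rule, the sketch below derives the robots.txt location that governs any given page URL. The helper name `robots_txt_url` is hypothetical; it relies only on Python's standard `urllib.parse`.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Return the robots.txt URL for the host that serves page_url."""
    parts = urlsplit(page_url)
    # Keep scheme and host; replace path, query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_txt_url("https://www.example.com/blog/post?id=1"))
# https://www.example.com/robots.txt
```

Because the file applies per host, a subdomain such as `shop.example.com` needs its own robots.txt at its own root.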

How do I block AI bots like GPTBot or Claude?

You can block specific AI bots by naming them in a User-agent line and following it with Disallow: /. Our generator includes a one-click toggle that adds these rules for the most common AI scrapers (GPTBot, ClaudeBot, CCBot, Google-Extended, etc.), asking them not to use your content to train their language models.
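For illustration, the sketch below shows one way such rules can look, with several User-agent lines sharing a single Disallow rule, and checks the result with Python's standard `urllib.robotparser`. The exact layout the generator emits may differ; the `AI_BLOCK` text is an assumption for this example.

```python
from urllib.robotparser import RobotFileParser

# Consecutive User-agent lines form one group that shares the rules below them.
AI_BLOCK = """\
User-agent: GPTBot
User-agent: ClaudeBot
User-agent: CCBot
User-agent: Google-Extended
Disallow: /
"""

parser = RobotFileParser()
parser.parse(AI_BLOCK.splitlines())

for bot in ("GPTBot", "ClaudeBot", "Googlebot"):
    # The AI bots are refused everywhere; Googlebot matches no group, so it is allowed.
    print(bot, parser.can_fetch(bot, "/blog/post"))
```

Search crawlers that are not named in the group remain unaffected, so blocking AI scrapers this way does not change how the site is crawled for search.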

What does 'User-agent: *' mean?

The asterisk (*) is a wildcard. 'User-agent: *' means the rules in that group apply to all web crawlers, except those that have their own specific User-agent group. A crawler with its own group follows only that group and ignores the wildcard rules.
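The precedence of a specific group over the wildcard can be demonstrated with a short sketch using Python's standard `urllib.robotparser`. The rules and the `/private/` path are made up for this example.

```python
from urllib.robotparser import RobotFileParser

RULES = """\
User-agent: *
Disallow: /admin/

User-agent: Googlebot
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())

# Bingbot has no group of its own, so the wildcard group applies.
print(parser.can_fetch("Bingbot", "/admin/"))      # False
# Googlebot has its own group, so the wildcard rules are ignored for it.
print(parser.can_fetch("Googlebot", "/admin/"))    # True
print(parser.can_fetch("Googlebot", "/private/"))  # False
```

If a path should be blocked for every crawler, it has to be repeated in each specific group as well as in the wildcard group.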

How does the Sitemap directive work in robots.txt?

You can point crawlers to your XML sitemap by adding a line to your robots.txt file: Sitemap: https://yoursite.com/sitemap.xml. The line can appear anywhere in the file, is not tied to any User-agent group, and must use the full absolute URL; placing it at the bottom is a common convention. This helps search engines that support the directive discover your sitemap automatically. You can include multiple Sitemap lines for multiple sitemap files. This complements but does not replace submitting your sitemap directly in Google Search Console.
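To confirm that Sitemap lines are picked up, a quick check along these lines is possible with Python's standard `urllib.robotparser` (the `site_maps()` method requires Python 3.8 or later). The second sitemap URL is a made-up example.

```python
from urllib.robotparser import RobotFileParser

RULES = """\
User-agent: *
Disallow: /admin/

Sitemap: https://yoursite.com/sitemap.xml
Sitemap: https://yoursite.com/sitemap-posts.xml
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())

# site_maps() returns the listed sitemap URLs in order, or None if there are none.
print(parser.site_maps())
```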