Free Robots.txt Generator — Block AI Bots & Set Directives
Create a valid robots.txt file in seconds. Select a CMS preset, add custom allow/disallow rules, instantly block AI scrapers with one click, and verify your paths with the live URL tester. Free, no login required.
Block AI Scrapers
Automatically add Disallow rules for GPTBot, ClaudeBot, CCBot, Google-Extended and others, so that crawlers honoring robots.txt do not collect your content for AI training.
Custom Rules
Generated robots.txt
# Generated by PursTech Robots.txt Generator
# https://www.purstech.com/tools/robots-txt-generator
User-agent: *
Disallow: /admin/
Live URL Tester
Check whether a specific URL or path is allowed for a given bot under the rules generated above.
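The same kind of check can be approximated offline. The sketch below is not the tool's own tester; it uses Python's standard `urllib.robotparser`, and the helper name `is_allowed` is made up for illustration. Note that this parser applies the first matching rule within a group, whereas some search engines apply the most specific (longest) match, so results can differ for overlapping Allow and Disallow rules.

```python
from urllib.robotparser import RobotFileParser

# The default rules shown in the generated file above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
"""


def is_allowed(robots_txt: str, user_agent: str, path: str) -> bool:
    """Return True if `user_agent` may fetch `path` under `robots_txt`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


print(is_allowed(ROBOTS_TXT, "Googlebot", "/admin/settings"))  # False
print(is_allowed(ROBOTS_TXT, "Googlebot", "/blog/post"))       # True
```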
Common Bots Guide
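* (wildcard): applies to all crawlers
Googlebot (Google): Google's main search crawler
Bingbot (Microsoft): Microsoft Bing search crawler
Slurp (Yahoo): Yahoo search crawler
DuckDuckBot (DuckDuckGo): DuckDuckGo search crawler
GPTBot (OpenAI): collects content that may be used to train the models behind ChatGPT
ClaudeBot (Anthropic): collects content that may be used to train Claude models
CCBot (Common Crawl): builds the Common Crawl dataset, which is used as AI training data
Google-Extended (Google): a control token rather than a separate crawler; it governs whether content Google crawls may be used to train Gemini models
anthropic-ai (Anthropic): older Anthropic user-agent token for Claude
Cohere-ai (Cohere): collects content used to train Cohere's enterprise AI models
Omgilibot (Omgili): scrapes content for AI and data analysis
FacebookBot (Meta): collects public pages for Meta's AI language models (link previews are typically fetched by facebookexternalhit)
Diffbot (Diffbot): extracts structured data from web pages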
How to Create a Robots.txt File
1. Choose a preset
Select WordPress, Shopify, or Next.js to populate the recommended allow and disallow paths for your platform.
2. Add custom rules
Add the paths you want to block (Disallow) or permit (Allow). Always start a path with a forward slash (/), and select which bot each rule applies to.
3. Block AI Scrapers
Toggle the Block AI switch to append rules asking GPTBot, ClaudeBot, CCBot, and others not to crawl your site for training data. These rules only affect crawlers that honor robots.txt.
4. Test & Download
Use the Live URL Tester to confirm that your private paths are actually blocked. Once verified, click Download and place the file in your website's root directory.
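For readers who prefer to script the steps above, here is a minimal sketch of assembling a robots.txt file from a set of rules. It is an illustration only, not the generator's actual code; the function name `build_robots_txt` and the shape of the `groups` argument are assumptions. It enforces the leading-slash rule from step 2.

```python
def build_robots_txt(groups, sitemaps=()):
    """Build robots.txt text.

    `groups` maps a user-agent token to a list of (directive, path) pairs,
    where directive is "Allow" or "Disallow". Hypothetical helper.
    """
    lines = []
    for user_agent, rules in groups.items():
        lines.append(f"User-agent: {user_agent}")
        for directive, path in rules:
            if directive not in ("Allow", "Disallow"):
                raise ValueError(f"unknown directive: {directive}")
            if not path.startswith("/"):
                raise ValueError(f"path must start with '/': {path}")
            lines.append(f"{directive}: {path}")
        lines.append("")  # a blank line separates groups
    lines.extend(f"Sitemap: {url}" for url in sitemaps)
    return "\n".join(lines).rstrip("\n") + "\n"


text = build_robots_txt({
    "*": [("Disallow", "/admin/")],
    "GPTBot": [("Disallow", "/")],
})
print(text)
```

Writing `text` to a file named robots.txt and uploading it to the site root corresponds to step 4.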
❓ Frequently Asked Questions
What is a robots.txt file?
A robots.txt file tells search engine crawlers which URLs they can access on your site. It is used mainly to avoid overloading your site with requests. It is not a mechanism for keeping a web page out of Google, because a blocked URL can still be indexed if other pages link to it. To keep a web page out of Google, block indexing with noindex or password-protect the page.
Where should I put my robots.txt file?
The robots.txt file must be located at the root of the website host to which it applies. For example, to control crawling on all URLs below https://www.example.com/, the robots.txt file must be located at https://www.example.com/robots.txt.
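As a small illustration of that rule, the following sketch derives the robots.txt location for any page URL by keeping the scheme and host and replacing everything else. The helper name `robots_url` and the sample page path are made up for the example.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the robots.txt URL that governs `page_url` (same scheme and host)."""
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_url("https://www.example.com/shop/item?id=3"))
# https://www.example.com/robots.txt
```

Because the host is part of the result, a subdomain needs its own robots.txt file at its own root.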
How do I block AI bots like GPTBot or ClaudeBot?
You can block specific AI bots by targeting their User-agent token. Our generator includes a one-click toggle that adds Disallow rules for the most common AI scrapers (GPTBot, ClaudeBot, CCBot, Google-Extended, etc.), asking them not to use your content to train their language models. Compliance is voluntary, so the rules only affect bots that honor robots.txt.
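A sketch of what such a block could look like, built in Python so it can be checked with the standard `urllib.robotparser`. The exact rules the toggle appends are not shown in this page, so the single shared group below is an assumption; the bot list uses only the tokens named above, and `ai_block_group` is a hypothetical helper.

```python
from urllib.robotparser import RobotFileParser

# Tokens named in the answer above; extend the list as needed.
AI_BOTS = ["GPTBot", "ClaudeBot", "CCBot", "Google-Extended"]


def ai_block_group(bots):
    """Return one robots.txt group that disallows the whole site for `bots`."""
    lines = [f"User-agent: {bot}" for bot in bots]
    lines.append("Disallow: /")
    return "\n".join(lines) + "\n"


rules = ai_block_group(AI_BOTS)
print(rules)

parser = RobotFileParser()
parser.parse(rules.splitlines())
print(parser.can_fetch("GPTBot", "/blog/"))     # False
print(parser.can_fetch("Googlebot", "/blog/"))  # True
```

Several User-agent lines may share one set of rules, which keeps the file short. Search crawlers such as Googlebot are unaffected because none of the listed tokens match them.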
What does 'User-agent: *' mean?
The asterisk (*) is a wildcard. 'User-agent: *' means the rules in that group apply to all web crawlers, except those that have their own specific User-agent group. A crawler with its own group follows only that group and ignores the wildcard rules.
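The effect of a specific group can be surprising, so here is a small check using Python's standard `urllib.robotparser`. The paths `/private/` and `/drafts/` are invented for the example. Googlebot has its own group, so the wildcard rule no longer applies to it.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: *
Disallow: /private/

User-agent: Googlebot
Disallow: /drafts/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Bingbot has no group of its own, so the wildcard group applies.
print(parser.can_fetch("Bingbot", "/private/report"))    # False
# Googlebot follows only its own group: /drafts/ is blocked...
print(parser.can_fetch("Googlebot", "/drafts/post"))     # False
# ...but /private/ is not, because the wildcard group is ignored.
print(parser.can_fetch("Googlebot", "/private/report"))  # True
```

If a path should stay blocked for every crawler, repeat the rule inside each specific group.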
How does the Sitemap directive work in robots.txt?
You can point crawlers to your XML sitemap by adding a line to your robots.txt file: Sitemap: https://yoursite.com/sitemap.xml. The line is independent of any User-agent group, so it can go anywhere in the file, although the bottom is a common choice. The URL must be absolute. Search engines that support the directive can then discover your sitemap automatically. You can include multiple Sitemap lines for multiple sitemap files. This complements but does not replace submitting your sitemap directly in Google Search Console.
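To confirm that Sitemap lines are picked up, Python's standard `urllib.robotparser` exposes them through `site_maps()` (available from Python 3.8). In this sketch the second sitemap URL, `sitemap-posts.xml`, is a made-up example of a site with multiple sitemap files.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/

Sitemap: https://yoursite.com/sitemap.xml
Sitemap: https://yoursite.com/sitemap-posts.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# site_maps() returns a list of URLs, or None when no Sitemap line exists.
sitemaps = parser.site_maps()
print(sitemaps)
```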