AI Crawler Directory

Compare Googlebot, Google-Extended, GPTBot, ChatGPT-User, OAI-SearchBot, ClaudeBot, PerplexityBot, CCBot, Applebot-Extended, Amazonbot, and Bytespider before changing robots.txt. Search discovery, model training, AI answers, page retrieval, and control tokens are separate policy surfaces.
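As a rough illustration of why these surfaces are separate, the sketch below parses a hypothetical robots.txt that keeps search discovery open while opting out of training-related tokens. The file contents, the path `/articles/launch`, and the helper name `decisions_for` are assumptions made for this example, not rules taken from any real site. It uses Python's standard `urllib.robotparser`, which applies the first matching rule in a group; real crawlers may resolve overlapping rules differently, so treat the result as an approximation.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical policy: allow search discovery, block training-related tokens.
ROBOTS_TXT = """\
User-agent: Googlebot
Allow: /

User-agent: OAI-SearchBot
Allow: /

User-agent: GPTBot
Disallow: /

User-agent: Google-Extended
Disallow: /
"""


def decisions_for(robots_txt: str, path: str, tokens: list[str]) -> dict[str, bool]:
    """Return {token: allowed} for one path, using the stdlib parser."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {token: parser.can_fetch(token, path) for token in tokens}


decisions = decisions_for(
    ROBOTS_TXT,
    "/articles/launch",
    ["Googlebot", "OAI-SearchBot", "GPTBot", "Google-Extended"],
)
for token, allowed in decisions.items():
    print(f"{token}: {'allow' if allowed else 'block'}")
```

Each token gets its own group, so blocking GPTBot or Google-Extended here leaves the Googlebot and OAI-SearchBot decisions unchanged.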

What this crawler directory is for

This directory explains which crawler user-agent token to test when you run a Google robots check, an AI crawler test, or an allow/block test in a robots.txt checker. Each crawler page shows the owner, user agent string, policy effect, examples, and links back to the checker.

Priority crawler references

Googlebot, Google-Extended, GPTBot, ChatGPT-User, OAI-SearchBot, ClaudeBot, PerplexityBot

OAI-SearchBot

OpenAI: OpenAI search discovery for ChatGPT search features.

Last verified 2026-06-11

GPTBot

OpenAI: OpenAI crawler for content that may be used to train generative AI models.

Last verified 2026-06-11

ChatGPT-User

OpenAI: User-initiated ChatGPT and Custom GPT requests.

Last verified 2026-06-11

Googlebot

Google: Google Search crawling and indexing.

Last verified 2026-06-11

Google-Extended

Google: Control token for certain Gemini and Vertex AI uses outside Google Search.

Last verified 2026-06-11

ClaudeBot

Anthropic: Anthropic automated web crawler.

Last verified 2026-06-11

PerplexityBot

Perplexity: Perplexity search and retrieval crawler.

Last verified 2026-06-11

CCBot

Common Crawl: Common Crawl dataset collection.

Last verified 2026-06-11

Applebot-Extended

Apple: Apple control token for use of web content in certain generative AI models.

Last verified 2026-06-11

Amazonbot

Amazon: Amazon web crawler used across Amazon services.

Last verified 2026-06-11

Bytespider

ByteDance: ByteDance crawler; a dedicated public purpose statement was not verified.

Last verified 2026-06-11

How these references are maintained

Each crawler's role is checked against primary documentation and carries a manually maintained verification date. Allowing a crawler does not guarantee crawling, indexing, inclusion, ranking, or citation. Read the methodology and source policy.

Use these pages as crawler-specific context, then test the exact URL path with the AI crawler robots.txt checker. A directory page describes policy meaning; the checker reports the deployed rule that matches a submitted site and path.
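A minimal offline sketch of that per-path check, assuming you already have the robots.txt text in hand: it evaluates one path against every token listed in this directory and reports which ones are blocked. The sample rules, the paths, and the function name `blocked_tokens` are hypothetical. The sketch relies on Python's standard `urllib.robotparser`, whose matching is simpler than what individual crawlers implement, so it does not replace a check against the deployed file.

```python
from urllib.robotparser import RobotFileParser

# Tokens covered by this directory.
DIRECTORY_TOKENS = [
    "Googlebot", "Google-Extended", "GPTBot", "ChatGPT-User",
    "OAI-SearchBot", "ClaudeBot", "PerplexityBot", "CCBot",
    "Applebot-Extended", "Amazonbot", "Bytespider",
]

# Hypothetical rules: a wildcard group plus two token-specific groups.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/

User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /
"""


def blocked_tokens(robots_txt: str, path: str) -> list[str]:
    """List the directory tokens that may not fetch the given path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [t for t in DIRECTORY_TOKENS if not parser.can_fetch(t, path)]


# Tokens without their own group fall back to the wildcard group.
print(blocked_tokens(SAMPLE_ROBOTS_TXT, "/blog/post"))
print(blocked_tokens(SAMPLE_ROBOTS_TXT, "/private/report"))
```

Testing two paths side by side shows why the exact path matters: a token with no group of its own inherits the wildcard rules, so it can be allowed on one path and blocked on another.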