AI Index Check

What Is llms.txt? A Practical Guide for AI Search

Learn what llms.txt is, where it lives, what to include, and why it should be treated as an emerging AI discovery convention.

What llms.txt is

llms.txt is an emerging convention for publishing a concise list of public resources that may help AI assistants and answer engines understand a site.

It is not a crawler access rule, a ranking guarantee, or proof that any model will crawl, train on, display, or cite the content.

What llms.txt is not

llms.txt does not control access, guarantee any ranking signal, act as a training opt-in contract, or prove that a model will cite a page.

Use it as a curated discovery aid. Keep crawl policy in robots.txt, indexability decisions in page metadata and canonical tags, and URL discovery in sitemap.xml.

Where llms.txt should live

The expected location is the site root, such as https://example.com/llms.txt. The file should be plain text, typically formatted as Markdown, easy to fetch without authentication, and focused on canonical public resources.
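As a small illustration of the root-level location, the sketch below derives the expected llms.txt address from any page URL on the same site. The function name llms_txt_url is hypothetical, and the sketch assumes the file sits at the root of the same scheme and host as the page.

```python
from urllib.parse import urlsplit, urlunsplit


def llms_txt_url(page_url: str) -> str:
    """Return the root-level llms.txt URL for the site serving page_url.

    Hypothetical helper: assumes the file lives at the root of the
    same scheme and host as the page.
    """
    parts = urlsplit(page_url)
    if not parts.scheme or not parts.netloc:
        raise ValueError(f"expected an absolute URL, got {page_url!r}")
    # Keep scheme and host; drop the path, query string, and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/llms.txt", "", ""))
```

For example, llms_txt_url("https://example.com/docs/start?x=1") returns "https://example.com/llms.txt".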

How llms.txt differs from robots.txt and sitemap.xml

robots.txt communicates crawl policy for compliant crawlers. sitemap.xml lists canonical URLs for search discovery workflows. llms.txt is different: it is a short, curated summary of a site's high-value public pages, written for AI assistants and answer engines.

The three files should not contradict each other. Do not list a page in llms.txt if robots.txt blocks it or if its canonical URL points elsewhere.
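One way to catch the robots.txt contradiction is to run every link in llms.txt through a robots.txt parser. The sketch below uses Python's standard urllib.robotparser. It assumes links are written in Markdown [title](url) form, and blocked_links is a hypothetical helper name. The standard-library parser handles simple prefix rules, so sites that rely on wildcard patterns may need a different parser.

```python
import re
from urllib.robotparser import RobotFileParser

# Assumption: links in llms.txt use Markdown syntax, e.g. [Docs](https://...).
LINK_RE = re.compile(r"\]\((https?://[^)\s]+)\)")


def blocked_links(llms_txt: str, robots_txt: str, user_agent: str = "*") -> list[str]:
    """Return the URLs listed in llms_txt that robots_txt disallows for user_agent."""
    parser = RobotFileParser()
    # Parse the robots.txt text directly; no network request is made.
    parser.parse(robots_txt.splitlines())
    urls = LINK_RE.findall(llms_txt)
    return [url for url in urls if not parser.can_fetch(user_agent, url)]
```

Any URL the function returns should either be removed from llms.txt or unblocked in robots.txt, depending on which file reflects the intended policy.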

What to include

Strong files usually include product overview pages, documentation, pricing, support, policies, API references, changelogs, and other pages that answer common questions directly.
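To make the shape of such a file concrete, here is a minimal sketch that renders a small llms.txt from a list of pages. The layout (a title line, a one-line summary, then headed sections of links with short notes) loosely follows the Markdown structure described in the public llms.txt proposal. Treat it as an assumption, not a requirement. The function name, section names, and example.com URLs are all hypothetical.

```python
def render_llms_txt(site_name: str, summary: str,
                    sections: dict[str, list[tuple[str, str, str]]]) -> str:
    """Render a Markdown-style llms.txt body.

    sections maps a heading to (title, url, note) tuples. The layout is an
    assumed convention, not a guaranteed standard.
    """
    lines = [f"# {site_name}", "", f"> {summary}"]
    for heading, links in sections.items():
        lines += ["", f"## {heading}", ""]
        for title, url, note in links:
            lines.append(f"- [{title}]({url}): {note}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical site and pages, for illustration only.
    print(render_llms_txt(
        "Example Co",
        "Example Co makes a hosted widget API.",
        {
            "Docs": [("API reference", "https://example.com/docs/api", "Endpoints and parameters")],
            "Company": [("Pricing", "https://example.com/pricing", "Plans and limits")],
        },
    ))
```

Keeping each note to a short phrase helps the file stay concise, which is the point of a curated list.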

Avoid private URLs, staging hosts, login-only content, tracking-heavy links, and pages blocked by robots.txt.
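Several of these exclusions can be screened for mechanically before a link is added. The sketch below flags staging-style hosts, tracking parameters, and login-only paths. The specific host labels, query keys, and path prefixes are illustrative assumptions and would need adjusting to a real site's conventions.

```python
from urllib.parse import parse_qsl, urlsplit

# Illustrative assumptions; adapt these to the site's own naming conventions.
STAGING_LABELS = {"staging", "stage", "dev", "test", "localhost"}
TRACKING_KEYS = {"gclid", "fbclid"}
PRIVATE_PATH_PREFIXES = ("/login", "/account", "/admin")


def link_problems(url: str) -> list[str]:
    """Return reasons a URL may not belong in llms.txt (empty list if none found)."""
    parts = urlsplit(url)
    problems = []
    host_labels = (parts.hostname or "").lower().split(".")
    if any(label in STAGING_LABELS for label in host_labels):
        problems.append("staging or local host")
    query_keys = [key.lower() for key, _ in parse_qsl(parts.query, keep_blank_values=True)]
    if any(key.startswith("utm_") or key in TRACKING_KEYS for key in query_keys):
        problems.append("tracking parameters")
    if parts.path.startswith(PRIVATE_PATH_PREFIXES):
        problems.append("login-only path")
    return problems
```

A check like this only catches obvious patterns; a page that requires login behind an ordinary-looking path still needs manual review.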
