llms.txt: format, who reads it, and how to make one

llms.txt is a plain Markdown file you put at the root of your domain, served at yourdomain.com/llms.txt. It lists your most important pages as links with a one-line description for each, so an AI assistant reading your site gets a clean, curated map instead of crawling through navigation, ads, and JavaScript to find what matters. That is the whole idea. It was proposed by Jeremy Howard of Answer.AI in September 2024, and the spec lives at llmstxt.org.

The file has picked up a lot of noise since then, most of it overpromising. So here is the honest version: what the format actually is, who reads it today and who does not, where it genuinely helps, and how to generate one in about two minutes.

What it does that robots.txt does not

robots.txt is a permission file. It tells crawlers which paths they may and may not fetch. It says nothing about what your content means or which pages matter most.

llms.txt is an orientation file. The spec is blunt about the problem it targets: language models have limited context windows, and converting a full HTML page into clean text is “difficult and imprecise.” A curated Markdown index sidesteps both. You hand the model a short, expert-level summary of your site and point it at the exact pages worth reading. The spec authors expect this to matter mostly at inference time, the moment a user asks an assistant something and it goes looking for current information, rather than during model training.

So the two files coexist. robots.txt controls access. llms.txt offers a reading guide for whatever is allowed through. Neither replaces the other.

The exact format

The spec is short, which is the point. A valid llms.txt file has these parts, in this order:

1. An H1 with the name of the site or project. This is the only required line.
2. A blockquote with a short summary that carries the key context for reading the rest.
3. Zero or more Markdown sections (paragraphs, lists) with more detail. No headings in this part.
4. Zero or more H2 sections, each holding a list of links. Every list item is a Markdown link, optionally followed by a colon and a short note.
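
To make the structure concrete, here is a minimal sketch of a parser for the parts listed above. It is not a reference implementation: the function name parse_llms_txt, the LINK_RE pattern, and the returned dictionary shape are assumptions made for illustration, and it skips the free-form detail between the summary and the first H2.

```python
import re

# One link item: "- [title](url)" optionally followed by ": note".
LINK_RE = re.compile(
    r"^-\s*\[(?P<title>[^\]]+)\]\((?P<url>[^)\s]+)\)(?::\s*(?P<note>.*))?$"
)


def parse_llms_txt(text):
    """Split an llms.txt document into title, summary, and H2 link sections."""
    title = None
    summary_lines = []
    sections = {}   # H2 name -> list of (title, url, note)
    current = None  # name of the H2 section being read, if any

    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("## "):
            current = line[3:].strip()
            sections[current] = []
        elif line.startswith("# ") and title is None:
            title = line[2:].strip()
        elif line.startswith(">") and current is None:
            # Blockquote lines before the first H2 form the summary.
            summary_lines.append(line.lstrip("> ").strip())
        elif current is not None:
            match = LINK_RE.match(line)
            if match:
                sections[current].append(
                    (match["title"], match["url"], match["note"] or "")
                )

    if title is None:
        raise ValueError("llms.txt needs an H1 title")
    return {
        "title": title,
        "summary": " ".join(summary_lines),
        "sections": sections,
    }
```

Calling parse_llms_txt on the text of a file gives you the title, the summary, and a dictionary of sections you can loop over, which is enough to lint a file before you publish it.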

One H2 carries special meaning. If you name a section ## Optional, the URLs under it can be skipped when a shorter context is needed. Use it for secondary material. Here is a trimmed example, close to the one the spec ships with:

# FastHTML

> FastHTML is a python library which brings together Starlette, Uvicorn, HTMX, and fastcore's FT "FastTags" into a library for creating server-rendered hypermedia applications.

Important notes:

- Although parts of its API are inspired by FastAPI, it is not compatible with FastAPI syntax and is not targeted at creating API services.

## Docs

- [FastHTML quick start](https://fastht.ml/docs/tutorials/quickstart_for_web_devs.html.md): A brief overview of many FastHTML features
- [HTMX reference](https://github.com/bigskysoftware/htmx/blob/master/www/content/reference.md): All HTMX attributes, CSS classes, headers, events, extensions, and config options

## Optional

- [Starlette full documentation](https://gist.githubusercontent.com/jph00/809e4a4808d4510be0e3dc9565e9cbd3/raw/starlette-sml.md): A subset of the Starlette docs useful for FastHTML

One detail people miss: the spec also proposes a companion habit. For any page worth feeding to a model, publish a clean Markdown version at the same URL with .md appended, so /docs/guide.html also answers at /docs/guide.html.md. The links in your llms.txt should point at those Markdown versions when they exist, not the HTML.
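
The .md convention is mechanical enough to sketch in a few lines. The helper below is illustrative only: the name markdown_url is made up, and treating a trailing slash as "no file name" is an assumption about how to apply the spec's rule that URLs without file names get index.html.md appended.

```python
from urllib.parse import urlsplit, urlunsplit


def markdown_url(page_url):
    """Return the URL where the .md convention expects the Markdown version."""
    parts = urlsplit(page_url)
    path = parts.path
    if path.endswith(".md"):
        return page_url  # already a Markdown URL, nothing to do
    if path == "" or path.endswith("/"):
        # No file name in the URL: append index.html.md instead of .md.
        path = (path or "/") + "index.html.md"
    else:
        path = path + ".md"
    return urlunsplit(
        (parts.scheme, parts.netloc, path, parts.query, parts.fragment)
    )
```

You could run this over every page you plan to list, request the result, and link the Markdown URL only when it actually answers.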

Who actually reads it today

There is a real difference between companies that publish an llms.txt and systems that actually consume yours, and most coverage blurs the two. Plenty of well-known sites publish one. Far fewer systems have been documented fetching and using third-party llms.txt files.

Google is the clearest case, because it has said so out loud. At a Search Central event in July 2025, Gary Illyes stated that Google does not support llms.txt and is not planning to. John Mueller had put it more bluntly earlier that year, saying that as far as he knew no AI service was using the file, and compared the idea to the old keywords meta tag: a self-declared signal that is trivial to game and therefore easy to ignore. Google’s own guidance for showing up in AI Overviews is the same advice as always: do normal SEO so Googlebot can crawl and index your real pages.

The other assistants are murkier, and honesty means saying so. None of OpenAI, Anthropic, or Perplexity has published documentation committing their crawlers to read your llms.txt. What exists instead is scattered first-party evidence. One operator posted server logs showing an OpenAI crawler hitting his llms.txt every fifteen minutes across a few sites. Meanwhile, broader server-log analyses keep finding the opposite: AI crawlers rarely request /llms.txt at all. Both things are true. A handful of sites see real fetches; most see almost none.
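
You can settle the question for your own site by counting requests in your access logs. The sketch below assumes the common "combined" log format used by Apache and nginx; the function name llms_txt_hits and the default list of user-agent tokens are illustrative choices, so adjust both to match your server and the bots you care about.

```python
import re
from collections import Counter

# Matches the request, status, size, referer, and user-agent fields of a
# combined-format access log line.
LOG_RE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)


def llms_txt_hits(log_lines, bot_tokens=("GPTBot", "ClaudeBot", "PerplexityBot")):
    """Count requests for /llms.txt, grouped by matching user-agent token."""
    hits = Counter()
    for line in log_lines:
        match = LOG_RE.search(line)
        if not match:
            continue  # not a GET/HEAD line in the expected format
        path = match["path"].split("?", 1)[0]  # ignore any query string
        if path != "/llms.txt":
            continue
        agent = match["agent"].lower()
        label = next((t for t in bot_tokens if t.lower() in agent), "other")
        hits[label] += 1
    return hits
```

Feed it the lines of a log file, for example llms_txt_hits(open("access.log")), and compare the counts against how often the same bots request your ordinary pages.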

The place llms.txt is unambiguously useful right now is developer documentation. This is the use case the spec was built around, and it is the one with real adoption. The llms-full.txt convention was developed by Mintlify together with Anthropic so that a person can paste one link into a coding assistant and load an entire docs set into context. That works today because the human is doing the fetching, on demand, for a tool that benefits from clean docs. No crawler guesswork required.

llms.txt vs llms-full.txt

These are two different files that often live side by side.

- llms.txt is the index. Titles, links, one-line descriptions. It tells a model what exists and where to find it.
- llms-full.txt is the whole thing. It concatenates your actual documentation content into one large Markdown file, meant to be dropped straight into a context window.

Use the index when you want a model to navigate and fetch selectively. Use the full file when someone wants to load everything at once into a coding assistant or a chat. Most documentation sites that bother with this publish both.
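
Building the full file is mostly concatenation. Here is a rough sketch, assuming you already have each page as Markdown text. There is no formal layout for llms-full.txt that we can point to, so the build_llms_full name, the per-page H2, and the Source line are all assumptions of this example rather than part of any spec.

```python
def build_llms_full(site_name, pages):
    """Concatenate Markdown pages into one llms-full.txt-style document.

    pages is a list of (title, url, markdown_body) tuples, in reading order.
    """
    chunks = [f"# {site_name}"]
    for title, url, body in pages:
        # One block per page: a heading, where it came from, then the content.
        chunks.append(f"## {title}\n\nSource: {url}\n\n{body.strip()}")
    return "\n\n".join(chunks) + "\n"
```

If your pages already contain their own H1 and H2 headings, you may want to demote them before concatenating so the combined outline stays readable.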

llms.txt vs robots.txt, side by side

| Question | robots.txt | llms.txt |
| --- | --- | --- |
| What is it for? | Granting or denying crawler access | Pointing AI at your best content |
| Who honors it? | Most major crawlers, including AI bots | Mostly developer tools, on demand |
| Format | Plain text directives | Markdown |
| Affects Google ranking? | Indirectly, by allowing crawl | No, by Google’s own statement |
| Risk if wrong | High, you can block real traffic | Low, it is mostly ignored |

The last row is the practical one. A bad robots.txt can make you invisible to AI assistants that fetch live pages when a user asks about you. A bad llms.txt mostly just sits there. That asymmetry should shape how much time you spend on each. If your underlying question is whether llms.txt actually moves your AI visibility, we worked through that in a separate post.

Common mistakes

- Publishing llms.txt while blocking AI bots in robots.txt. This contradiction is common. You invite assistants to read a curated index, then your robots.txt disallows GPTBot or ClaudeBot from fetching the pages it points to. Check the access side first.
- Treating it as a ranking lever. It is not one, at least not for Google, and there is no public evidence it lifts visibility on the other assistants either. Publish it because clean, well-described content is good hygiene, not because you expect a rankings bump.
- Listing URLs that drift from reality. Links that 404, redirect, or describe a page that has since changed make the file worse than having none. If you cannot keep it current, keep it small.
- Pointing at HTML when a Markdown version exists. The spec’s value comes from clean Markdown. Link the .md versions of pages where you have them.
- Serving it with the wrong content type. It should return as plain text or Markdown at the root path, not get trapped behind an HTML template or a redirect.
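
The first mistake on that list is easy to test for. The sketch below uses Python's standard urllib.robotparser to check every same-host link in an llms.txt against a robots.txt. The function name blocked_links and the default agent list are illustrative, and the parser follows Python's own matching rules, which may differ in edge cases from how a given crawler reads the file.

```python
import re
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser

# Pulls the URL out of every Markdown link: [text](url)
LINK_URL_RE = re.compile(r"\]\((https?://[^)\s]+)\)")


def blocked_links(llms_txt, robots_txt, site_host, agents=("GPTBot", "ClaudeBot")):
    """Return (agent, url) pairs that llms.txt lists but robots.txt disallows."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())

    blocked = []
    for url in LINK_URL_RE.findall(llms_txt):
        if urlsplit(url).netloc != site_host:
            continue  # a robots.txt only governs its own host
        for agent in agents:
            if not parser.can_fetch(agent, url):
                blocked.append((agent, url))
    return blocked
```

An empty result means the two files agree for the agents you checked. Anything else is a page you are advertising and blocking at the same time.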

Generate one in about two minutes

You can write the file by hand from the format above. If you would rather not, our free llms.txt generator crawls your sitemap, pulls titles and descriptions, and produces a spec-compliant file you can drop at your root. No account, no login.
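
If you go the hand-written route, a few lines of code keep the formatting consistent. This is a minimal sketch of rendering the format from data you already have, not a description of how any particular generator works; the render_llms_txt name and the shape of the sections argument are assumptions for the example.

```python
def render_llms_txt(name, summary, sections):
    """Render an llms.txt document.

    sections maps each H2 heading to a list of (title, url, note) tuples;
    an empty note produces a bare link.
    """
    lines = [f"# {name}", "", f"> {summary}"]
    for heading, links in sections.items():
        lines += ["", f"## {heading}", ""]
        for title, url, note in links:
            item = f"- [{title}]({url})"
            if note:
                item += f": {note}"
            lines.append(item)
    return "\n".join(lines) + "\n"
```

Put secondary links under a section named Optional, and write the returned string to a file called llms.txt.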

[Screenshot: the free SurfacedBy llms.txt generator, showing a Website Domain input, a Generate llms.txt button, and Secure Check, Instant Results, and Up to 10 pages labels.]

Then deploy it like any static file. Upload it to your web root next to robots.txt so it answers at yourdomain.com/llms.txt. No server config changes required.
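
After uploading, it is worth confirming the file is served the way you expect. The sketch below separates the checks from the fetch so the logic is easy to test. The function names are made up for this example, and the accepted content types (text/plain and text/markdown) reflect the "plain text or Markdown" guidance in this post rather than a requirement we can cite from the spec.

```python
from urllib.request import urlopen


def check_llms_response(status, content_type, body):
    """Return a list of problems found in a response for /llms.txt."""
    problems = []
    if status != 200:
        problems.append(f"expected HTTP 200, got {status}")
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type not in ("text/plain", "text/markdown"):
        problems.append(f"unexpected content type: {media_type or 'missing'}")
    if not body.lstrip().startswith("# "):
        problems.append("body does not start with an H1 line")
    return problems


def fetch_and_check(domain):
    """Fetch https://<domain>/llms.txt and run the checks above.

    urlopen follows redirects and raises HTTPError on 4xx/5xx responses,
    so a missing file surfaces as an exception rather than a problem entry.
    """
    url = f"https://{domain}/llms.txt"
    with urlopen(url, timeout=10) as resp:
        body = resp.read().decode("utf-8", errors="replace")
        problems = check_llms_response(
            resp.status, resp.headers.get("Content-Type", ""), body
        )
        if resp.geturl() != url:
            problems.append(f"redirected to {resp.geturl()}")
    return problems
```

An empty list from fetch_and_check("yourdomain.com") means the file answers at the root path, with a text content type, and starts with the H1 the format requires.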

One more thing worth doing while you are in there. Before you spend any effort curating content for AI, confirm the assistants can reach it at all; we think that check should come before any generative engine optimization (GEO) work. Our robots.txt AI bot checker tells you whether ChatGPT, Claude, Gemini, and Perplexity are allowed to fetch your pages. That is the file that can actually make you disappear, so it is the one to get right first.
