llms.txt

Does the site publish a well-formed `/llms.txt` that gives an LLM a compact, structured reading list?

What it is

llms.txt is a plain-text markdown file at the root of a site, proposed at llmstxt.org in 2024. Think of it as sitemap.xml rewritten for LLMs: short, opinionated, and with descriptions instead of a flat list of URLs.

Canonical structure:

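    # Site Name
    > One-sentence site description for the LLM.

    ## Section A
    - [Page title](https://example.com/page.md): One-sentence description.

    ## Section B
    - [Page title](https://example.com/other.md): One-sentence description.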

Why it matters

  • A well-designed llms.txt fits an entire product's documentation within an agent's first read.
  • Reduces the grep-loop: the agent reads one file and knows exactly which URL to fetch next.
  • Cloudflare's own docs saw ~31% token reduction and 66% faster correct answers after adopting llms.txt properly.

Remediation Prompt

I want to improve my site's agent readiness. Please implement the following fix for llms.txt across our codebase.

Instructions:
Please fix the llms.txt issue on my site so it is agent-ready.

How we test it

| Step | Method | URL |
|------|--------|-----|
| A | GET | /llms.txt |
| B | GET | /llms-full.txt (observational; does not affect pass/fail but is surfaced in evidence) |

Body cap: 1 MB for llms.txt, 5 MB for llms-full.txt.

Some sites also publish `/llms-full.txt` — all content concatenated into one file for drop-in context loading.

Pass Warn Fail Matrix

| Condition | Status | Score |
|-----------|--------|-------|
| /llms.txt exists, parses as markdown with a # Title, ≥1 H2 section, ≥1 link | pass | 1.0 |
| Exists but structure is thin (<3 links total, no headings) | warn | 0.5 |
| Exists but returns HTML or is empty | fail | 0.0 |
| Does not exist (404) | fail | 0.0 |

We do not penalise missing /llms-full.txt — it's a bonus signal.
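
The matrix above can be read as a small decision function. The following is a minimal sketch, not the checker's actual implementation: `classifyLlmsTxt`, its parameters, and the `Verdict` type are hypothetical names. Two readings are assumed where the matrix is silent: any non-200 status is treated like a 404, and a file counts as thin when it has fewer than three links or no headings at all.

```typescript
type Verdict = { status: "pass" | "warn" | "fail"; score: number };

// Hypothetical inputs: the HTTP status, Content-Type header, and body
// returned by GET /llms.txt.
function classifyLlmsTxt(httpStatus: number, contentType: string, body: string): Verdict {
  const fail: Verdict = { status: "fail", score: 0.0 };

  // Assumption: every non-200 response is handled like the 404 row.
  if (httpStatus !== 200) return fail;
  // "Exists but returns HTML or is empty".
  if (body.trim() === "") return fail;
  const looksLikeHtml =
    /^text\/html/i.test(contentType.trim()) || /^\s*<(!doctype|html)\b/i.test(body);
  if (looksLikeHtml) return fail;

  const lines = body.split(/\r?\n/);
  const hasTitle = lines.some((line) => /^# \S/.test(line));
  const h2Count = lines.filter((line) => /^## \S/.test(line)).length;
  const linkCount = (body.match(/\[[^\]]+\]\([^)\s]+\)/g) ?? []).length;

  // Assumption: the "thin" row is checked before the "pass" row, and either
  // condition (few links, or no headings at all) is enough to trigger it.
  const thin = linkCount < 3 || (!hasTitle && h2Count === 0);
  if (thin) return { status: "warn", score: 0.5 };

  if (hasTitle && h2Count >= 1) return { status: "pass", score: 1.0 };

  // Enough links but only part of the heading structure: treated as thin too.
  return { status: "warn", score: 0.5 };
}
```

Checking the thin row first means a file with a title, one section, and a single link is reported as warn rather than pass; swap the order if the other reading is intended.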

Sub Tests

| id | Weight | Pass when |
|----|--------|-----------|
| exists | 0.5 | 200 + content-type text/* + non-empty |
| well-structured | 0.4 | Has # Title, ≥1 ## Section, ≥3 bullet links |
| llms-full-present | 0.1 | /llms-full.txt is also reachable |
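
As an illustration of how these sub-tests might be evaluated and combined, here is a hedged sketch. The function names are hypothetical, and summing the weights of the passing sub-tests is an assumption: the table lists weights but does not say how they are aggregated.

```typescript
// Weights copied from the sub-test table.
const SUB_TEST_WEIGHTS = {
  "exists": 0.5,
  "well-structured": 0.4,
  "llms-full-present": 0.1,
} as const;

type SubTestId = keyof typeof SUB_TEST_WEIGHTS;

// exists: 200 + content-type text/* + non-empty body.
function existsPasses(httpStatus: number, contentType: string, body: string): boolean {
  return httpStatus === 200 && /^text\//i.test(contentType.trim()) && body.trim().length > 0;
}

// well-structured: a "# Title", at least one "## Section", at least three bullet links.
function wellStructuredPasses(body: string): boolean {
  const lines = body.split(/\r?\n/);
  const hasTitle = lines.some((line) => /^# \S/.test(line));
  const hasSection = lines.some((line) => /^## \S/.test(line));
  const bulletLinks = lines.filter((line) => /^\s*[-*] \[[^\]]+\]\([^)\s]+\)/.test(line)).length;
  return hasTitle && hasSection && bulletLinks >= 3;
}

// Assumption: the combined score is the sum of the weights of passing sub-tests.
function weightedScore(results: Record<SubTestId, boolean>): number {
  let total = 0;
  for (const id of Object.keys(SUB_TEST_WEIGHTS) as SubTestId[]) {
    if (results[id]) total += SUB_TEST_WEIGHTS[id];
  }
  return total;
}
```

Under that assumption, a site with a valid, well-structured `/llms.txt` and no `/llms-full.txt` would total 0.9.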

Remediation Prompt

Please create an llms.txt for my site at /llms.txt. It must be a single markdown file following the llmstxt.org convention. Follow these rules:

1. Start with a single H1 that names the site.
2. Immediately below, a blockquote (one line) describing the site.
3. For each top-level content area of the site, add a ## H2 section with a bulleted list of the key pages. Each bullet is `[Title](absolute-url): one-sentence description.`
4. Prefer linking to markdown variants of pages (with `.md` extension or content-negotiable URLs).
5. Keep the whole file under ~8,000 tokens. For large sites, use multiple llms.txt files — one at the site root listing top-level directories, and one per directory (e.g. /docs/llms.txt, /api/llms.txt).
6. Exclude low-value directory listing pages and redirects. Point at canonical content only.
7. Also generate an llms-full.txt that concatenates the content of every listed page, in reading order, for drop-in context loading by agents.

Serve both as text/plain; charset=utf-8.

Example:

    # Example
    > A developer platform for building on the edge.

    ## Documentation
    - [Getting Started](https://example.com/docs/start.md): 5-minute onboarding.
    - [API Reference](https://example.com/docs/api.md): Full REST reference.

    ## Changelog
    - [Release Notes](https://example.com/changelog.md): Versioned change history.

Implementation Examples

Next.js route handler

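    // src/app/llms.txt/route.ts
    // buildLlmsTxt is a project-local helper (not shown) that returns the file body as a string.
    import { buildLlmsTxt } from '@/lib/llms';

    // Serve /llms.txt as plain text so agents never receive an HTML shell.
    export async function GET() {
      const body = await buildLlmsTxt();
      return new Response(body, {
        headers: { 'content-type': 'text/plain; charset=utf-8' },
      });
    }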
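
The handler imports `buildLlmsTxt` from a project module that is not shown here. One possible shape for the rendering step inside it is sketched below, assuming the site's pages are already available as structured data. `renderLlmsTxt` and the `LlmsSite`, `LlmsSection`, and `LlmsLink` types are hypothetical names, not part of Next.js or of the llmstxt.org proposal.

```typescript
interface LlmsLink { title: string; url: string; description: string }
interface LlmsSection { heading: string; links: LlmsLink[] }
interface LlmsSite { name: string; summary: string; sections: LlmsSection[] }

// Render an H1, a one-line blockquote, then one H2 per section with
// "[Title](absolute-url): description" bullets.
function renderLlmsTxt(site: LlmsSite): string {
  const out: string[] = [`# ${site.name}`, `> ${site.summary}`];
  for (const section of site.sections) {
    out.push("", `## ${section.heading}`);
    for (const link of section.links) {
      // Reject relative URLs early; agents need absolute ones.
      if (!/^https?:\/\//.test(link.url)) {
        throw new Error(`URL must be absolute: ${link.url}`);
      }
      out.push(`- [${link.title}](${link.url}): ${link.description}`);
    }
  }
  return out.join("\n") + "\n";
}
```

A `buildLlmsTxt` helper could load the page list from the CMS or file system and return `renderLlmsTxt(site)`; because the handler awaits the result, it works whether that helper is synchronous or asynchronous.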

Multi-file strategy (large docs sites)

Root /llms.txt lists pointers:

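    # Example
    > Platform.

    ## Top Level Sections
    - [Docs](https://example.com/docs/llms.txt): Index of documentation pages.
    - [API](https://example.com/api/llms.txt): Index of API reference pages.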


Each subdirectory's `llms.txt` lists its own children.
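
To make the split concrete, the sketch below groups page URLs by their first path segment and decides which `llms.txt` file should list each one. It is only an illustration under stated assumptions: `topLevelDir` and `planLlmsFiles` are hypothetical helpers, one level of nesting is assumed, and pages that sit directly at the site root stay in the root file.

```typescript
// Return the first path segment of an absolute URL when the page lives
// inside a directory, or null for pages directly under the site root.
function topLevelDir(url: string): string | null {
  const match = /^https?:\/\/[^/]+\/([^/]+)\/.+/.exec(url);
  return match?.[1] ?? null;
}

// Map each llms.txt path to the URLs it should list. The root file gets
// one pointer per subdirectory index plus any root-level pages.
function planLlmsFiles(siteOrigin: string, pageUrls: string[]): Map<string, string[]> {
  const plan = new Map<string, string[]>();
  const rootEntries: string[] = [];
  plan.set("/llms.txt", rootEntries);

  for (const url of pageUrls) {
    const dir = topLevelDir(url);
    if (dir === null) {
      rootEntries.push(url);
      continue;
    }
    const file = `/${dir}/llms.txt`;
    let entries = plan.get(file);
    if (!entries) {
      entries = [];
      plan.set(file, entries);
      // First page seen in this directory: add a pointer to the root index.
      rootEntries.push(`${siteOrigin}${file}`);
    }
    entries.push(url);
  }
  return plan;
}
```

Each entry in the returned map can then be rendered as its own file, with titles and descriptions added per link.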

Common Mistakes

  • Pointing to HTML URLs instead of markdown-capable ones
  • Dumping every blog post without descriptions (useless for agents)
  • Including all 5000 docs pages in one file → exceeds context
  • Serving the file with content-type: text/markdown (either is acceptable, but text/plain is the more common convention and the one in the original proposal)
  • Forgetting absolute URLs
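
Several of these mistakes can be caught mechanically before publishing. The following is a rough lint sketch, assuming bullets follow the `[Title](url): description` shape. `lintLlmsTxt` is a hypothetical helper, the four-characters-per-token estimate is a crude heuristic rather than a real tokenizer, and a link without a `.md` suffix is only flagged as a possible problem because it may still be content-negotiable.

```typescript
// Return a list of human-readable problems found in an llms.txt body.
function lintLlmsTxt(text: string): string[] {
  const problems: string[] = [];
  const bulletLink = /^\s*[-*] \[([^\]]+)\]\(([^)\s]+)\)(.*)$/;

  text.split(/\r?\n/).forEach((line, index) => {
    const match = bulletLink.exec(line);
    if (!match) return; // not a bullet link, nothing to check
    const url = match[2] ?? "";
    const rest = (match[3] ?? "").trim();
    const where = `line ${index + 1}`;

    if (!/^https?:\/\//.test(url)) {
      problems.push(`${where}: URL is not absolute (${url})`);
    }
    if (!/\.md([?#]|$)/.test(url)) {
      problems.push(`${where}: link may point at HTML rather than a markdown variant (${url})`);
    }
    if (!/^:\s*\S/.test(rest)) {
      problems.push(`${where}: link has no description`);
    }
  });

  // Heuristic only: roughly four characters per token.
  const approxTokens = Math.ceil(text.length / 4);
  if (approxTokens > 8000) {
    problems.push(`file is about ${approxTokens} tokens; consider splitting it into multiple llms.txt files`);
  }
  return problems;
}
```

Running this in CI against the generated file is one way to keep regressions out; the content-type still has to be checked against the live response.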

Test Fixtures

  • pass-well-formed.txt
  • pass-with-llms-full.txt
  • warn-thin.txt
  • fail-404.json
  • fail-html.txt (served HTML at /llms.txt)