llms.txt is no longer one file — alongside the index you publish a full-content llms-full.txt, plus content negotiation that returns the same URL as markdown to agents and HTML to browsers. AIDE's markdown-negotiation and llms-full-txt-pass checks test exactly this dual-mode delivery.
What AIDE actually checks
markdown-negotiation probes all three paths:
- With Accept: text/markdown, does the same URL return markdown?
- With ?format=md, does the same URL return markdown?
- Is Vary: Accept set correctly (otherwise the CDN cache mixes representations)?
llms-full-txt-pass additionally verifies that /llms-full.txt (a) returns 200 and (b) has a non-trivial content hash — i.e. it's actually populated, not a placeholder.
Why two paths (Accept header + query string)?
In theory Vary: Accept alone is enough. In practice:
- Cloudflare Free + Pro don't fully respect Vary: Accept, so cached HTML can leak to an agent asking for markdown
- Some LLM gateways strip Accept headers
- Manual testing is easier with a query string for humans
Keep both.
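The two paths collapse into one decision on the server side. The following is a minimal TypeScript sketch of that decision, not Caddy's or AIDE's actual logic: wantsMarkdown and its parameters are hypothetical names, and the Accept parsing is deliberately simplified (only an explicit text/markdown entry with a non-zero q-value counts; wildcards such as */* are ignored).

```typescript
// Hypothetical helper (not part of any library): decides whether a request
// should receive the markdown representation.
function wantsMarkdown(
  acceptHeader: string | null,
  formatParam: string | null,
): boolean {
  // Path 2: explicit query-string override (?format=md).
  if (formatParam === "md") return true

  // Path 1: the Accept header lists text/markdown with a non-zero q-value.
  if (!acceptHeader) return false
  for (const part of acceptHeader.split(",")) {
    const [type, ...params] = part.split(";").map((s) => s.trim().toLowerCase())
    if (type !== "text/markdown") continue
    const q = params.find((p) => p.startsWith("q="))
    // A missing q parameter means full weight (q=1).
    const weight = q === undefined ? 1 : Number(q.slice(2))
    if (weight > 0) return true
  }
  return false
}
```

In a request handler, the second argument would typically come from something like new URL(request.url).searchParams.get("format"). Checking the query string first keeps the fallback working even when a gateway has stripped the Accept header.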
Minimum Caddy setup (live aide.tr config)
This is a copy-paste of the production Caddyfile on aide.tr:
aide.tr, www.aide.tr {
    encode gzip zstd

    # 1) Accept-header negotiation
    @wantsMarkdown {
        header_regexp Accept text/markdown
        path / /tr /tr/ /en /en/
    }
    rewrite @wantsMarkdown /llms-full.txt
    header @wantsMarkdown Vary "Accept, Accept-Encoding"

    # 2) Query-string fallback: CF cache key includes the query string,
    #    so /?format=md and / are separate entries (no leak)
    @formatMd {
        query format=md
        path / /tr /tr/ /en /en/
    }
    rewrite @formatMd /llms-full.txt

    # 3) Link header (RFC 8288): agents discover the alternate
    #    representation in a single HEAD
    header {
        Link "<https://aide.tr/llms-full.txt>; rel=\"alternate\"; type=\"text/markdown\"; title=\"LLM-friendly full content\""
    }

    reverse_proxy web:3000 {
        header_up X-Forwarded-Proto https
    }
}
Three building blocks:
- @wantsMarkdown matcher: when the Accept header matches the text/markdown regex, rewrite (not redirect), so the agent asks for one URL and gets the answer at that same URL.
- @formatMd matcher: query-string fallback. The CDN includes the query string in its cache key, so this variant is stored as a separate cache entry and cannot leak into the HTML one.
- Link header: per RFC 8288, rel=alternate advertises the markdown variant, so a single HEAD probe is enough to discover it.
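On the consuming side, an agent has to pull the alternate URL out of that Link header. The sketch below shows one way to do it in TypeScript; parseLinkHeader and findMarkdownAlternate are hypothetical names, and the parser is a simplification of RFC 8288 that assumes parameter values never contain a "<" character.

```typescript
interface LinkEntry {
  target: string
  params: Record<string, string>
}

// Simplified RFC 8288 parser (assumption: no "<" inside parameter values).
function parseLinkHeader(header: string): LinkEntry[] {
  const entries: LinkEntry[] = []
  // Each link is "<target>" followed by its parameters, up to the next "<".
  const linkRe = /<([^>]*)>([^<]*)/g
  let link: RegExpExecArray | null
  while ((link = linkRe.exec(header)) !== null) {
    const params: Record<string, string> = {}
    // name="quoted value" or name=token
    const paramRe = /;\s*([\w*-]+)\s*=\s*(?:"([^"]*)"|([^;,\s]*))/g
    let p: RegExpExecArray | null
    while ((p = paramRe.exec(link[2])) !== null) {
      params[p[1].toLowerCase()] = p[2] !== undefined ? p[2] : p[3]
    }
    entries.push({ target: link[1], params })
  }
  return entries
}

// Returns the first rel=alternate target typed as text/markdown, or null.
function findMarkdownAlternate(header: string): string | null {
  for (const { target, params } of parseLinkHeader(header)) {
    const rels = (params["rel"] || "").toLowerCase().split(/\s+/)
    if (rels.indexOf("alternate") !== -1 && params["type"] === "text/markdown") {
      return target
    }
  }
  return null
}
```

Fed the header value from the Caddyfile above, findMarkdownAlternate returns https://aide.tr/llms-full.txt.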
Next.js side: generating llms-full.txt
Not a static file but a dynamic route. app/llms-full.txt/route.ts:
// app/llms-full.txt/route.ts
import { listLearnArticles, listFeatured } from "@/lib/learn"
import { NextResponse } from "next/server"

export const revalidate = 3600 // 1 hour

export async function GET() {
  const [articles, featured] = await Promise.all([
    listLearnArticles("en"),
    listFeatured(20, "en"),
  ])

  const lines: string[] = []

  // Header
  lines.push("# AIDE — AI Detect Engine")
  lines.push("> Is your site ready for AI agents?")
  lines.push("")

  // Featured articles: full content
  for (const article of featured) {
    lines.push(`## ${article.title}`)
    lines.push("")
    lines.push(article.body) // raw markdown
    lines.push("")
    lines.push(`Source: https://aide.tr/en/learn/${article.slug}`)
    lines.push("---")
    lines.push("")
  }

  // Other articles: link only
  lines.push("## More articles")
  lines.push("")
  for (const article of articles) {
    lines.push(`- [${article.title}](https://aide.tr/en/learn/${article.slug}) — ${article.description}`)
  }

  const body = lines.join("\n")

  return new NextResponse(body, {
    headers: {
      "Content-Type": "text/markdown; charset=utf-8",
      "Cache-Control": "public, max-age=300, s-maxage=3600",
      "X-Robots-Tag": "noindex", // crawlers only: keep this URL out of HTML index
    },
  })
}
Three decisions worth flagging:
| Decision | Why |
|---|---|
| revalidate = 3600 | Regenerates the file at most once an hour; revalidating more often eats build quota |
| s-maxage=3600 but max-age=300 | Edge holds for 1 h; browsers re-request after 5 min |
| X-Robots-Tag: noindex | Keeps search engines from indexing the markdown copy as a duplicate that competes with the HTML pages |
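The split between s-maxage and max-age is easier to see when the header is read the way each cache reads it: shared caches (the CDN edge) prefer s-maxage, while private caches (browsers) ignore it and use max-age. The sketch below illustrates that rule; ttlSeconds is a hypothetical helper and handles only these two directives.

```typescript
type CacheKind = "browser" | "shared"

// Hypothetical helper: returns the freshness lifetime in seconds that a cache
// of the given kind would apply, or null if the header gives no lifetime.
function ttlSeconds(cacheControl: string, kind: CacheKind): number | null {
  const directives: Record<string, string> = {}
  for (const raw of cacheControl.split(",")) {
    const [name, value = ""] = raw.trim().toLowerCase().split("=")
    if (name) directives[name] = value
  }
  // Shared caches prefer s-maxage when present; browsers only look at max-age.
  const key = kind === "shared" && "s-maxage" in directives ? "s-maxage" : "max-age"
  if (!(key in directives)) return null
  const seconds = Number(directives[key])
  return Number.isFinite(seconds) ? seconds : null
}

const header = "public, max-age=300, s-maxage=3600"
console.log(ttlSeconds(header, "browser")) // 300
console.log(ttlSeconds(header, "shared")) // 3600
```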
llms.txt vs llms-full.txt — what's the difference?
These complement each other:
- llms.txt = your map. One link per article, short summary. Small (5–50 KB).
- llms-full.txt = your library. Full text of featured content + the map. Large (100 KB – 5 MB).
Which does an agent want?
- Information lookup → llms-full.txt (answer is already there, no extra fetch)
- Topical search → llms.txt (find the link, then fetch only what's needed)
Publish both. AIDE scores all three:
- llms-txt-exists (does llms.txt exist?)
- llms-full-txt-exists (does llms-full.txt exist?)
- llms-full-txt-pass (is it populated, or just a placeholder?)
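For the map side, the index can be built from the same article list that feeds the full file. The sketch below is a minimal TypeScript example of such a builder, following the layout suggested by llmstxt.org (H1 title, blockquote summary, H2 section with a link list). buildLlmsTxt, ArticleSummary, and the "Articles" section name are assumptions for illustration, not the code running on aide.tr.

```typescript
// Hypothetical article shape: only the fields the index needs.
interface ArticleSummary {
  title: string
  slug: string
  description: string
}

// Builds an llms.txt index: one link per article with a short summary.
function buildLlmsTxt(
  siteName: string,
  tagline: string,
  baseUrl: string,
  articles: ArticleSummary[],
): string {
  const lines: string[] = [`# ${siteName}`, "", `> ${tagline}`, "", "## Articles", ""]
  for (const article of articles) {
    lines.push(`- [${article.title}](${baseUrl}/${article.slug}): ${article.description}`)
  }
  return lines.join("\n") + "\n"
}
```

Because the output contains links and one-line summaries only, it stays small even when the article list grows, which is what keeps the index in the size range quoted above.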
Common mistakes
| Mistake | Symptom | Fix |
|---|---|---|
| Only query-string negotiation | markdown-negotiation PASSes in AIDE but Cloudflare leaks HTML in production | Add Vary: Accept header and re-test |
| llms-full.txt is empty / "Coming soon" | llms-full-txt-pass FAIL — content hash near-empty | Populate at build time, no placeholders |
| Content-Type: text/plain | Some parsers don't recognize markdown | text/markdown; charset=utf-8 is mandatory |
| Cache TTL 24 h while content changes every 5 min | Stale markdown delivered to agents | Tune s-maxage to your content cadence; max 1 h |
Testing — what AIDE does
# Accept header test
curl -i https://your-site.com/ -H 'Accept: text/markdown' | head -20
# Expect: Content-Type: text/markdown, body is markdown
# Query-string test
curl -i 'https://your-site.com/?format=md' | head -5
# Expect: same
# Vary header
curl -I https://your-site.com/ | grep -i vary
# Expect: Vary: Accept, Accept-Encoding
# llms-full.txt size
curl -s https://your-site.com/llms-full.txt | wc -c
# Expect: ≥10 000 bytes (no placeholder)
If all four hold, AIDE PASSes the related checks.
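For a CI pipeline, the same four expectations can be evaluated in code once the responses have been captured. The sketch below is an assumption about how such a check could look, not AIDE's implementation: ProbeResult, ProbeSet, Verdict, and evaluate are hypothetical names, and the 10 000-byte threshold is simply the figure from the curl comments above.

```typescript
// What we keep from each HTTP response (captured by any HTTP client).
interface ProbeResult {
  status: number
  contentType: string | null
  vary: string | null
  bodyBytes: number
}

interface ProbeSet {
  acceptProbe: ProbeResult // GET / with Accept: text/markdown
  queryProbe: ProbeResult // GET /?format=md
  plainProbe: ProbeResult // HEAD / with default headers
  fullTxtProbe: ProbeResult // GET /llms-full.txt
}

interface Verdict {
  acceptHeader: boolean
  queryString: boolean
  varyAccept: boolean
  fullTxtPopulated: boolean
}

const isMarkdown = (r: ProbeResult): boolean =>
  r.status === 200 && (r.contentType || "").toLowerCase().startsWith("text/markdown")

// Applies the four expectations from the curl walkthrough to captured probes.
function evaluate(p: ProbeSet, minBytes: number = 10000): Verdict {
  const varyTokens = (p.plainProbe.vary || "")
    .toLowerCase()
    .split(",")
    .map((t) => t.trim())
  return {
    acceptHeader: isMarkdown(p.acceptProbe),
    queryString: isMarkdown(p.queryProbe),
    // Exact token match, so "Accept-Encoding" alone does not count.
    varyAccept: varyTokens.indexOf("accept") !== -1,
    fullTxtPopulated: p.fullTxtProbe.status === 200 && p.fullTxtProbe.bodyBytes >= minBytes,
  }
}
```

Matching Vary on whole tokens matters: a substring test would wrongly accept a response that only carries Vary: Accept-Encoding.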
Production hardening
- Compression: llms-full.txt can be 1 MB+; gzip/zstd is mandatory (encode gzip zstd in the Caddy block).
- Per-locale: Separate file for TR and EN. URLs are /llms-full.txt (default lang) and /en/llms-full.txt.
- Sitemap + llms.txt sync: Build both from your article list, so they cannot drift.
- Analytics: Log llms-full.txt requests separately. Knowing which agents (by UA) hit it most often is strategic.
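For the analytics point, a per-agent tally needs little more than substring matching on the User-Agent. The sketch below is illustrative only: tallyAgents is a hypothetical helper, and the token list is an assumption that should be checked against each vendor's published crawler documentation before relying on it.

```typescript
// Assumed User-Agent substrings for a few AI crawlers; verify before use.
const AGENT_TOKENS = ["GPTBot", "ClaudeBot", "PerplexityBot", "CCBot"]

// Counts llms-full.txt requests per agent; anything unrecognized is "other".
function tallyAgents(userAgents: string[]): Record<string, number> {
  const counts: Record<string, number> = {}
  for (const ua of userAgents) {
    const lower = ua.toLowerCase()
    const hit = AGENT_TOKENS.find((token) => lower.indexOf(token.toLowerCase()) !== -1)
    const key = hit !== undefined ? hit : "other"
    counts[key] = (counts[key] || 0) + 1
  }
  return counts
}
```

The input would be the User-Agent values from access-log lines whose path is /llms-full.txt (or a locale variant); how those lines are extracted depends on the log format in use.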
Related resources
- llmstxt.org spec
- RFC 8288 — Web Linking
- Caddy reverse_proxy + matchers
- AIDE check details: /learn/markdown-negotiation, /learn/llms-full-txt-exists, /learn/llms-full-txt-pass