What it is
Content Signals is a 2025 proposal that extends robots.txt with a Content-Signal directive. It separates three independent decisions:
- ai-train: may this content be used as training data?
- ai-input: may this content be used as live input to inference (RAG, grounding, assistant context)?
- search: may this content be indexed for search?
Each key takes the value yes or no, and the key=value pairs are comma-separated.
User-agent: *
Content-Signal: search=yes, ai-input=yes, ai-train=no
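As a minimal sketch of how a line like the one above could be read, the following splits a directive into its key=value pairs. The function name parse_content_signal is hypothetical, and matching the directive name case-insensitively and lowercasing the values are assumptions rather than requirements stated here.

```python
def parse_content_signal(line: str) -> dict[str, str]:
    """Parse one 'Content-Signal: k=v, k=v' line into a dict (hypothetical helper)."""
    name, _, value = line.partition(":")
    # Assumption: the directive name is matched case-insensitively.
    if name.strip().lower() != "content-signal":
        raise ValueError(f"not a Content-Signal line: {line!r}")
    signals: dict[str, str] = {}
    for pair in value.split(","):
        if not pair.strip():
            continue  # tolerate a trailing comma
        key, sep, val = pair.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got {pair.strip()!r}")
        # Assumption: keys and values are normalized to lowercase.
        signals[key.strip().lower()] = val.strip().lower()
    return signals


print(parse_content_signal("Content-Signal: search=yes, ai-input=yes, ai-train=no"))
```

The parser does not check that values are yes or no; it only recovers the pairs so that a later step can judge them.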
Why it matters
- Lets owners make the training/inference distinction that a simple Allow/Disallow cannot.
- Major publishers (Reuters, AP, and other large news sites) already use it.
- Complements bot-specific rules.
How we test it
Re-use /robots.txt from check 01. Look for Content-Signal: lines inside any User-agent: group.
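The lookup described above could be sketched roughly as follows. This is an illustration, not the checker's actual code: find_content_signals is a hypothetical name, and the grouping rule (consecutive User-agent lines share a group, and a User-agent line after any rule starts a new one) is an assumption about how groups are delimited.

```python
def find_content_signals(robots_txt: str) -> list[tuple[list[str], str]]:
    """Return (user_agents, raw_value) for each Content-Signal line inside a group."""
    found = []
    agents: list[str] = []  # user-agents of the group currently being read
    in_rules = False        # True once a non-User-agent line has been seen
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        field, _, value = line.partition(":")
        field, value = field.strip().lower(), value.strip()
        if field == "user-agent":
            if in_rules:  # a User-agent line after rules starts a new group
                agents, in_rules = [], False
            agents.append(value)
        else:
            in_rules = True
            # Only count directives that belong to some User-agent group.
            if field == "content-signal" and agents:
                found.append((list(agents), value))
    return found


sample = """User-agent: *
Content-Signal: search=yes, ai-train=no
Disallow: /private
"""
for user_agents, value in find_content_signals(sample):
    print(user_agents, value)
```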
Pass Warn Fail Matrix
| Condition | Status | Score |
|---|---|---|
| ≥1 Content-Signal: with at least 2 of the 3 keys explicitly set | pass | 1.0 |
| Content-Signal: present but only 1 key or values outside {yes,no} | warn | 0.5 |
| No Content-Signal: anywhere | fail | 0.0 |
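One way to read the matrix is sketched below. It assumes the directives have already been parsed into dicts, and that a key counts as "explicitly set" only when its value is yes or no; both the function name classify and that interpretation are assumptions.

```python
KEYS = {"search", "ai-input", "ai-train"}
VALUES = {"yes", "no"}


def classify(directives: list[dict[str, str]]) -> tuple[str, float]:
    """Map parsed Content-Signal directives to (status, score) per the matrix."""
    if not directives:
        return ("fail", 0.0)  # no Content-Signal: anywhere
    for directive in directives:
        # Assumption: a key is "explicitly set" only with a yes/no value.
        valid_keys = [k for k, v in directive.items() if k in KEYS and v in VALUES]
        if len(valid_keys) >= 2:
            return ("pass", 1.0)
    return ("warn", 0.5)  # present, but only one key or invalid values


print(classify([{"search": "yes", "ai-train": "no"}]))
print(classify([{"search": "true"}]))
print(classify([]))
```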
Sub Tests
| id | Weight | Pass when |
|---|---|---|
| directive-present | 0.6 | At least one valid directive |
| complete | 0.4 | All three keys (ai-train, ai-input, search) set explicitly |
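The weighted sub-tests could be combined along these lines. This is a sketch under stated assumptions: a directive is treated as "valid" when every pair uses a known key with a yes/no value, "complete" is evaluated within a single directive, and the names is_valid and sub_test_score are hypothetical.

```python
KEYS = {"search", "ai-input", "ai-train"}
VALUES = {"yes", "no"}
WEIGHTS = {"directive-present": 0.6, "complete": 0.4}


def is_valid(directive: dict[str, str]) -> bool:
    # Assumption: valid means non-empty, known keys only, yes/no values only.
    return bool(directive) and all(
        key in KEYS and value in VALUES for key, value in directive.items()
    )


def sub_test_score(directives: list[dict[str, str]]) -> tuple[dict[str, bool], float]:
    """Return per-sub-test results and the sum of the weights that passed."""
    results = {
        "directive-present": any(is_valid(d) for d in directives),
        # Assumption: all three keys must appear in one valid directive.
        "complete": any(is_valid(d) and KEYS.issubset(d) for d in directives),
    }
    score = sum(WEIGHTS[name] for name, passed in results.items() if passed)
    return results, score


results, score = sub_test_score([{"search": "yes", "ai-input": "yes", "ai-train": "no"}])
print(results, score)
```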
Remediation Prompt
Please add a Content-Signal directive to my /robots.txt. It should sit under at least the User-agent: * group and declare all three preferences:
User-agent: *
Content-Signal: search=<yes|no>, ai-input=<yes|no>, ai-train=<yes|no>
Typical choices for content/publisher sites:
Content-Signal: search=yes, ai-input=yes, ai-train=no
(Allow search, allow AI-assisted inference on my content, disallow use as training data.)
Typical choices for docs or open projects:
Content-Signal: search=yes, ai-input=yes, ai-train=yes
You may add Content-Signal under specific bot user-agents too, to vary per bot.
Do not modify existing Allow/Disallow rules.
Implementation Examples
See the robots.txt examples in check 01 and add the Content-Signal: line to the relevant User-agent: group.
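For sites that generate robots.txt programmatically, adding the line might look like the sketch below. The function name add_content_signal and the default signal values are illustrative assumptions; the sketch inserts the directive after the User-agent: * line (and any User-agent lines grouped with it), leaves Allow/Disallow rules untouched, and does nothing if a Content-Signal line already exists.

```python
def _field(line: str) -> str:
    """Lowercased field name of a robots.txt line, ignoring comments."""
    return line.split("#", 1)[0].partition(":")[0].strip().lower()


def add_content_signal(
    robots_txt: str,
    signal: str = "search=yes, ai-input=yes, ai-train=no",  # example values only
) -> str:
    lines = robots_txt.splitlines()
    if any(_field(line) == "content-signal" for line in lines):
        return robots_txt  # already present; leave the file untouched
    directive = f"Content-Signal: {signal}"
    for i, line in enumerate(lines):
        value = line.split("#", 1)[0].partition(":")[2].strip()
        if _field(line) == "user-agent" and value == "*":
            j = i + 1
            # Skip other User-agent lines that share this group.
            while j < len(lines) and _field(lines[j]) == "user-agent":
                j += 1
            lines.insert(j, directive)
            break
    else:
        # No User-agent: * group found; append a new one.
        if lines and lines[-1].strip():
            lines.append("")
        lines += ["User-agent: *", directive]
    return "\n".join(lines) + "\n"


print(add_content_signal("User-agent: *\nDisallow: /admin\n"))
```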
Common Mistakes
- Content-Signals: (plural). The spec uses the singular.
- Values other than yes/no (1/0, true/false).
- Missing commas between key-value pairs.
- Placing the directive outside a User-agent: group. It must be inside.
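The mistakes listed above can be caught with a small lint pass. The following is a rough sketch, with lint_content_signals as a hypothetical name; it treats "outside a group" as "before any User-agent line", which is an assumption about where groups begin and end.

```python
def lint_content_signals(robots_txt: str) -> list[str]:
    """Report the common Content-Signal mistakes found in a robots.txt body."""
    problems = []
    in_group = False  # assumption: a group starts at the first User-agent line
    for number, raw in enumerate(robots_txt.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments
        field, sep, value = line.partition(":")
        field = field.strip().lower()
        if not sep:
            continue
        if field == "user-agent":
            in_group = True
            continue
        if field == "content-signals":
            problems.append(f"line {number}: 'Content-Signals' is plural; use 'Content-Signal'")
            continue
        if field != "content-signal":
            continue
        if not in_group:
            problems.append(f"line {number}: directive appears outside any User-agent group")
        for pair in value.split(","):
            pair = pair.strip()
            if not pair:
                continue
            if pair.count("=") > 1:  # two pairs run together without a comma
                problems.append(f"line {number}: missing comma in {pair!r}")
                continue
            _, eq, val = pair.partition("=")
            if not eq or val.strip().lower() not in {"yes", "no"}:
                problems.append(f"line {number}: value in {pair!r} is not yes or no")
    return problems


for problem in lint_content_signals("User-agent: *\nContent-Signals: search=true\n"):
    print(problem)
```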
Test Fixtures
- pass-all-three.txt
- warn-only-one-key.txt
- warn-wrong-value.txt
- fail-missing.txt