Technical GEO
llms.txt: The Complete Guide to AI-Readable Content Summaries
Fewer than 1% of websites have an llms.txt file, even though generating one takes about 60 seconds. Here is everything you need to know about the format and why it matters for AI discoverability.
10 min read
llms.txt is a Markdown file at the root of your domain that tells AI language models which pages contain your best content and what those pages cover. It was proposed by Jeremy Howard (fast.ai) in 2024. Fewer than 1% of websites have one, so implementing it correctly puts you in a small minority of sites that give AI models a curated, machine-readable summary of their content.
llms.txt Adoption and Impact
| Metric | Value | Detail |
|---|---|---|
| Sites with llms.txt | <1% | As of Q1 2025 |
| File format | Markdown | Human and machine readable |
| Standard location | /llms.txt | Root of your domain |
| Generation time | 60s | With Innotek GEO Audit |
The llms.txt File Format Specification
# Innotek AgenticSEO
> GEO platform for SMEs that analyses website entity clarity, fact density,
> and schema completeness for AI model citation optimisation.
> Founded 2024. Rickmansworth, Hertfordshire, UK.
## Core Platform
- [GEO Audit Overview](https://innotekseoai.com/): Complete AI readiness scoring across Entity Clarity, Fact Density, and Schema Completeness metrics
- [Pricing Plans](https://innotekseoai.com/pricing): Free audit tier, Pro at £29/month (unlimited audits, 15 pages), Enterprise at £299/month (50 pages, API access, MCP integration)
- [About Innotek](https://innotekseoai.com/about): Founded by Innotek Solutions Ltd, Rickmansworth, Hertfordshire. GEO-first platform built for SME AI discoverability
## GEO Guides
- [Entity Clarity Guide](https://innotekseoai.com/articles/entity-clarity-guide): 7-dimension framework for making AI models identify your organisation with confidence
- [Fact Density Playbook](https://innotekseoai.com/articles/fact-density-playbook): 8-step rewriting framework; high-density pages get 3.7× more AI citations
- [Schema Completeness Checklist](https://innotekseoai.com/articles/schema-completeness-checklist): 24 schema types across 4 tiers; 100% completeness = 3.1× citation lift
## Optional
- [Blog](https://innotekseoai.com/blog): GEO research, case studies, and platform updates
llms.txt Component Reference
| Component | Syntax | Required | Purpose |
|---|---|---|---|
| Site Heading | # Site Name | Yes | Primary entity name for AI knowledge graph anchoring |
| Blockquote Description | > Description text | No (recommended) | disambiguatingDescription equivalent: what you do, who you serve, and where you operate |
| Section Headings | ## Section Name | No (recommended, 1+) | Group pages by topic cluster for topical authority signalling |
| Page Links | - [Title](URL): Description | No (recommended, 2+) | Individual page entries with context for AI routing decisions |
| Optional Section | ## Optional | No | Lower-priority pages AI may skip when token budget is tight |
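The components above can be turned into a quick structural check. What follows is a minimal Python sketch, not an official parser: `LlmsTxt`, `parse_llms_txt`, and `check_llms_txt` are hypothetical names, and the link pattern assumes the `- [Title](URL): Description` form used in the sample file. It treats a missing H1 as an error and the other components as warnings, since the original proposal at llmstxt.org requires only the H1.

```python
import re
from dataclasses import dataclass, field

# Assumed link form: "- [Title](URL): Description" (description optional).
LINK_RE = re.compile(
    r"^-\s*\[(?P<title>[^\]]+)\]\((?P<url>[^)\s]+)\)(?::\s*(?P<desc>.*))?$"
)


@dataclass
class LlmsTxt:
    title: str = ""
    description: str = ""
    # Maps section name -> list of (title, url, description) tuples.
    sections: dict = field(default_factory=dict)


def parse_llms_txt(text: str) -> LlmsTxt:
    """Split an llms.txt document into heading, blockquote, and sections."""
    doc = LlmsTxt()
    quote_lines = []
    current = None  # name of the H2 section being read, if any
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("## "):
            current = line[3:].strip()
            doc.sections[current] = []
        elif line.startswith("# ") and not doc.title:
            doc.title = line[2:].strip()
        elif line.startswith(">") and current is None:
            quote_lines.append(line[1:].strip())
        elif current is not None:
            match = LINK_RE.match(line)
            if match:
                doc.sections[current].append(
                    (match["title"], match["url"], match["desc"] or "")
                )
    doc.description = " ".join(quote_lines)
    return doc


def check_llms_txt(doc: LlmsTxt) -> list:
    """Return (level, message) findings based on the component table."""
    findings = []
    if not doc.title:
        findings.append(("error", "missing H1 site heading"))
    if not doc.description:
        findings.append(("warning", "missing blockquote description"))
    if not doc.sections:
        findings.append(("warning", "no H2 sections"))
    if sum(len(links) for links in doc.sections.values()) < 2:
        findings.append(("warning", "fewer than 2 page links"))
    return findings


# Hypothetical sample; example.com stands in for a real site.
SAMPLE = """\
# Example Co
> Hypothetical company used only for illustration.
## Guides
- [Guide A](https://example.com/a): First guide
- [Guide B](https://example.com/b): Second guide
## Optional
- [Blog](https://example.com/blog): Updates
"""

doc = parse_llms_txt(SAMPLE)
print(doc.title, list(doc.sections))
print(check_llms_txt(doc))
```

A check like this could run in a publishing pipeline before the file is deployed. The thresholds (one section, two links) simply mirror the table and can be loosened if you only want to enforce the H1.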
Token-Optimised Writing for llms.txt
llms.txt vs robots.txt vs sitemap.xml
What it controls
- robots.txt: which crawlers can access which paths
- sitemap.xml: which URLs exist and when they were updated
- llms.txt: which pages contain your best content and what they cover
Who reads it
- robots.txt: all crawlers (Googlebot, GPTBot, ClaudeBot, etc.)
- sitemap.xml: search engine crawlers for index discovery
- llms.txt: AI language model training and retrieval pipelines
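To make the difference concrete, the sketch below reads one small in-memory example of each file and asks the question each format is designed to answer. It is an illustration under stated assumptions: the file contents, the `example.com` URLs, and the helper names `may_fetch`, `sitemap_entries`, and `llms_pages` are all hypothetical, and only Python standard-library modules are used.

```python
import re
import xml.etree.ElementTree as ET
from urllib.robotparser import RobotFileParser

# Hypothetical file contents for illustration only.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /private/

User-agent: *
Allow: /
"""

SITEMAP_XML = """\
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc><lastmod>2025-01-15</lastmod></url>
  <url><loc>https://example.com/pricing</loc><lastmod>2025-02-01</lastmod></url>
</urlset>
"""

LLMS_TXT = """\
# Example Co
> Hypothetical company used only for illustration.
## Core
- [Pricing](https://example.com/pricing): Plan tiers and limits
"""


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """robots.txt answers: may this crawler access this path?"""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


def sitemap_entries(sitemap_xml: str) -> list:
    """sitemap.xml answers: which URLs exist and when did they change?"""
    ns = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
    root = ET.fromstring(sitemap_xml)
    return [
        (url.findtext("sm:loc", namespaces=ns),
         url.findtext("sm:lastmod", namespaces=ns))
        for url in root.findall("sm:url", ns)
    ]


def llms_pages(llms_txt: str) -> list:
    """llms.txt answers: which pages matter and what do they cover?"""
    pattern = re.compile(r"^- \[([^\]]+)\]\(([^)\s]+)\):\s*(.+)$", re.MULTILINE)
    return [match.groups() for match in pattern.finditer(llms_txt)]


print(may_fetch(ROBOTS_TXT, "GPTBot", "https://example.com/private/report"))
print(sitemap_entries(SITEMAP_XML))
print(llms_pages(LLMS_TXT))
```

Only the llms.txt result carries a human-written description of each page. That description is the part the other two formats have no field for.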
llms.txt Best Practices
- Use your legal entity name as the H1: The H1 heading is how AI models will reference you in knowledge graph entries. Use your exact trading name or registered company name, not a marketing tagline or product name.
- Write the blockquote as a disambiguatingDescription: The blockquote is treated as your primary entity description. Include what you do (specific verb + object), who you serve (audience definition), and where you operate (geography). This block is always included regardless of token budget.
- Match section headings to your Schema.org topic clusters: If your Schema.org markup uses knowsAbout arrays grouped by theme, mirror that structure in your llms.txt sections. Consistent topic signals across both formats reinforce topical authority.
- Put your highest fact-density pages first: When token budgets are tight, AI models read top-to-bottom and truncate. Your most citation-worthy pages, the ones with the highest Fact Density Index scores, should appear in the first section.
- Update it when you publish major new content: llms.txt is not indexed like a sitemap; instead, AI pipelines fetch it periodically. Update it within 24 hours of publishing new cornerstone content so it gets included in the next retrieval cycle.
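The ordering advice above can be automated when the file is generated from a page inventory. The following is a minimal sketch under stated assumptions: the `Page` dataclass, `build_llms_txt`, and the `fact_density` numbers are hypothetical stand-ins (this guide does not define how a Fact Density Index score is computed), and `example.com` is a placeholder domain. Sections are ordered by their best-scoring page, pages are ordered by score within each section, and the `Optional` section always goes last.

```python
from dataclasses import dataclass

OPTIONAL = "Optional"  # section name that readers may skip under a tight token budget


@dataclass
class Page:
    title: str
    url: str
    description: str
    section: str
    fact_density: float  # hypothetical score; higher means more citation-worthy


def build_llms_txt(name: str, summary: str, pages: list) -> str:
    """Render llms.txt with the highest-scoring sections and pages first."""
    grouped = {}
    for page in pages:
        grouped.setdefault(page.section, []).append(page)

    optional_pages = grouped.pop(OPTIONAL, [])
    # Rank the remaining sections by their single best page, descending.
    ranked_sections = sorted(
        grouped.items(),
        key=lambda item: max(p.fact_density for p in item[1]),
        reverse=True,
    )
    if optional_pages:
        ranked_sections.append((OPTIONAL, optional_pages))

    blocks = [f"# {name}", f"> {summary}"]
    for section, members in ranked_sections:
        ranked = sorted(members, key=lambda p: p.fact_density, reverse=True)
        links = "\n".join(f"- [{p.title}]({p.url}): {p.description}" for p in ranked)
        blocks.append(f"## {section}\n\n{links}")
    return "\n\n".join(blocks) + "\n"


# Hypothetical inventory, deliberately listed out of order.
pages = [
    Page("Blog", "https://example.com/blog", "Updates", OPTIONAL, 0.2),
    Page("Checklist", "https://example.com/checklist", "Schema checklist", "Guides", 0.6),
    Page("Playbook", "https://example.com/playbook", "Rewriting steps", "Guides", 0.9),
    Page("Pricing", "https://example.com/pricing", "Plan tiers", "Core", 0.7),
]

output = build_llms_txt("Example Co", "Hypothetical summary.", pages)
print(output)
```

Regenerating the file from the same inventory whenever cornerstone content is published keeps the ordering and the update cadence in one step.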