llms.txt: The Complete Guide to AI-Readable Content Summaries

Less than 1% of websites have an llms.txt file. It takes 60 seconds to generate. Here is everything you need to know about the format and why it matters for AI discoverability.

27 January 2025·10 min read

llms.txt is a Markdown file at the root of your domain that tells AI language models which pages contain your best content and what those pages cover. Jeremy Howard (Answer.AI, fast.ai) proposed the format in September 2024. Fewer than 1% of websites have one, so publishing a well-formed file puts you in a small minority of sites that hand AI models a curated map of what to index and cite.

llms.txt Adoption and Impact

Metric	Value	Note
Sites with llms.txt	<1%	As of Q1 2025
File format	Markdown	Human and machine readable
Standard location	/llms.txt	Root of your domain
Generation time	60s	With Innotek GEO Audit
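
Because the file always sits at the root of the domain, its address can be derived from any page URL on the site. The snippet below is a minimal sketch of that rule using only the Python standard library; the helper name `llms_txt_url` is hypothetical and chosen for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def llms_txt_url(page_url: str) -> str:
    """Return the root-level /llms.txt URL for the site hosting page_url."""
    parts = urlsplit(page_url)
    if not parts.scheme or not parts.netloc:
        raise ValueError(f"not an absolute URL: {page_url!r}")
    # Keep the scheme and host; discard the path, query string, and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/llms.txt", "", ""))


print(llms_txt_url("https://innotekseoai.com/articles/entity-clarity-guide"))
# prints https://innotekseoai.com/llms.txt
```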

The llms.txt File Format Specification

# Innotek AgenticSEO

> GEO platform for SMEs that analyses website entity clarity, fact density,
> and schema completeness for AI model citation optimisation.
> Founded 2024. Rickmansworth, Hertfordshire, UK.

## Core Platform

- [GEO Audit Overview](https://innotekseoai.com/): Complete AI readiness scoring across Entity Clarity, Fact Density, and Schema Completeness metrics
- [Pricing Plans](https://innotekseoai.com/pricing): Free audit tier, Pro at £29/month (unlimited audits, 15 pages), Enterprise at £299/month (50 pages, API access, MCP integration)
- [About Innotek](https://innotekseoai.com/about): Founded by Innotek Solutions Ltd, Rickmansworth, Hertfordshire. GEO-first platform built for SME AI discoverability

## GEO Guides

- [Entity Clarity Guide](https://innotekseoai.com/articles/entity-clarity-guide): 7-dimension framework for making AI models identify your organisation with confidence
- [Fact Density Playbook](https://innotekseoai.com/articles/fact-density-playbook): 8-step rewriting framework; high-density pages get 3.7× more AI citations
- [Schema Completeness Checklist](https://innotekseoai.com/articles/schema-completeness-checklist): 24 schema types across 4 tiers; 100% completeness = 3.1× citation lift

## Optional

- [Blog](https://innotekseoai.com/blog): GEO research, case studies, and platform updates
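
A file in this shape is simple enough to generate from structured data. The following is a rough sketch rather than the way any particular tool builds it: the `Link` and `Section` classes and the `render_llms_txt` function are hypothetical names, and the code assumes Python 3.9 or later.

```python
from dataclasses import dataclass, field


@dataclass
class Link:
    title: str
    url: str
    note: str = ""  # optional description shown after the colon


@dataclass
class Section:
    heading: str
    links: list[Link] = field(default_factory=list)


def render_llms_txt(name: str, summary: str, sections: list[Section]) -> str:
    """Render an H1, a blockquote summary, and '## ' sections of links."""
    lines = [f"# {name}", ""]
    # Each line of the summary becomes one blockquote line.
    lines += [f"> {line}" for line in summary.splitlines()]
    for section in sections:
        lines += ["", f"## {section.heading}", ""]
        for link in section.links:
            entry = f"- [{link.title}]({link.url})"
            if link.note:
                entry += f": {link.note}"
            lines.append(entry)
    return "\n".join(lines) + "\n"


print(render_llms_txt(
    "Innotek AgenticSEO",
    "GEO platform for SMEs.\nFounded 2024. Rickmansworth, Hertfordshire, UK.",
    [Section("Optional", [Link("Blog", "https://innotekseoai.com/blog",
                               "GEO research, case studies, and platform updates")])],
))
```

Keeping the content in data structures and rendering on publish makes it easier to keep the file in step with the site than editing the Markdown by hand.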

llms.txt Component Reference

Component	Syntax	Required	Purpose
Site Heading	# Site Name	Yes	Primary entity name for AI knowledge graph anchoring
Blockquote Description	> Description text	Recommended (optional in the llms.txt proposal)	disambiguatingDescription equivalent: what you do, who you serve, where
Section Headings	## Section Name	Recommended (1+)	Group pages by topic cluster for topical authority signalling
Page Links	- [Title](URL): Description	Recommended (2+)	Individual page entries with context for AI routing decisions
Optional Section	## Optional	No	Lower-priority pages AI may skip when token budget is tight
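
The components in the table above can be checked mechanically before the file is published. The sketch below is illustrative only: `check_llms_txt` and `LINK_RE` are hypothetical names, and the regular expression is a deliberately simple approximation of a Markdown link entry, not a full Markdown parser.

```python
import re

# "- [Title](URL)" with an optional ": Description" suffix.
LINK_RE = re.compile(r"^- \[(?P<title>[^\]]+)\]\((?P<url>[^)\s]+)\)(?:: (?P<note>.+))?$")


def check_llms_txt(text: str) -> list[str]:
    """Return a list of problems; an empty list means every check passed."""
    lines = [line.rstrip() for line in text.splitlines()]
    content = [line for line in lines if line]
    problems = []

    if not content or not content[0].startswith("# "):
        problems.append("first non-blank line should be an H1 ('# Site Name')")
    if not any(line.startswith("> ") for line in content):
        problems.append("no blockquote description ('> ...') found")
    if not any(line.startswith("## ") for line in content):
        problems.append("no '## ' section headings found")

    # Every list item should be a Markdown link, optionally followed by a note.
    for number, line in enumerate(lines, start=1):
        if line.startswith("- ") and not LINK_RE.match(line):
            problems.append(f"line {number}: list item is not '- [Title](URL): Description'")
    return problems
```

Running this in a publishing pipeline catches the most common slip, a list item written as plain text instead of a link, before a crawler ever sees the file.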

Token-Optimised Writing for llms.txt

llms.txt vs robots.txt vs sitemap.xml

File	What it controls	Who reads it
robots.txt	Which crawlers can access which paths	All crawlers (Googlebot, GPTBot, ClaudeBot, etc.)
sitemap.xml	Which URLs exist and when they were updated	Search engine crawlers for index discovery
llms.txt	Which pages contain your best content and what they cover	AI language model training and retrieval pipelines
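
Of the three files, robots.txt is the only one that governs access, so a page listed in llms.txt is still off limits to a crawler that robots.txt blocks. The sketch below shows that access check with Python's standard `urllib.robotparser`; the robots.txt rules and the example.com URLs are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: GPTBot is kept out of /private/, everyone else is allowed.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /private/

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# robots.txt answers "may this crawler fetch this path?"
print(parser.can_fetch("GPTBot", "https://example.com/private/report"))     # False
print(parser.can_fetch("GPTBot", "https://example.com/llms.txt"))           # True
print(parser.can_fetch("ClaudeBot", "https://example.com/private/report"))  # True
```

It is worth running a check like this against every URL you list in llms.txt, since pointing AI crawlers at pages they are not allowed to fetch wastes the entry.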

llms.txt Best Practices

Use your legal entity name as the H1: The H1 heading is how AI models will reference you in knowledge graph entries. Use your registered company name or exact trading name, not a marketing tagline or product name.
Write the blockquote as a disambiguatingDescription: The blockquote is treated as your primary entity description. Include what you do (specific verb + object), who you serve (audience definition), and where you operate (geography). Because it sits at the top of the file, it is the part most likely to be read even when the token budget is tight.
Match section headings to your Schema.org topic clusters: If your Schema.org markup uses knowsAbout arrays grouped by theme, mirror that structure in your llms.txt sections. Consistent topic signals across both formats reinforce topical authority.
Put your highest fact-density pages first: A model that reads top to bottom and truncates when its token budget runs out will drop the later sections first. Your most citation-worthy pages, the ones with the highest Fact Density Index scores, should appear in the first section.
Update it when you publish major new content: llms.txt is not indexed like a sitemap; AI pipelines fetch it periodically. Update it within 24 hours of publishing new cornerstone content so it is included in the next retrieval cycle.
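
To see why ordering matters, consider a consumer that has to fit the file into a fixed budget. How real AI pipelines truncate is not publicly documented, so the sketch below is only an assumed model of that behaviour: the hypothetical `trim_for_budget` drops the `## Optional` section first, then removes sections from the bottom, and uses a character count as a stand-in for tokens.

```python
import re


def trim_for_budget(text: str, max_chars: int) -> str:
    """Shrink an llms.txt file to at most max_chars where possible."""
    if len(text) <= max_chars:
        return text
    # Split into the header (H1 + blockquote) and the '## ' sections, in order.
    header, *sections = re.split(r"(?m)^(?=## )", text)
    # The Optional section is the first to go.
    sections = [s for s in sections if not s.startswith("## Optional")]
    # Then drop sections from the bottom until the remainder fits.
    while sections and len(header + "".join(sections)) > max_chars:
        sections.pop()
    return header + "".join(sections)
```

Under this model the header and the first section survive longest, which is the reasoning behind putting the entity description and the strongest pages at the top.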
