robots.txt, llms.txt, and the Future of Web Permissions
The web is gaining new permission signals. Here is how classic crawler rules and AI-era policy files can work together.
robots.txt remains important
The robots.txt file is simple, familiar, and easy for crawlers to discover at a well-known location. Its restraint is part of its value. But modern uses of content increasingly demand more nuance than allowing or disallowing a user agent by URL path.
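For concreteness, here is what that allow-or-disallow model looks like in practice: a minimal robots.txt checked with Python's standard urllib.robotparser. The paths and the ExampleBot user agent are placeholders.

```python
from urllib.robotparser import RobotFileParser

# A minimal robots.txt: per-user-agent, path-prefix allow/disallow rules.
ROBOTS_TXT = """\
User-agent: *
Disallow: /drafts/
Allow: /docs/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# The protocol answers exactly one question: may this agent fetch this URL?
print(parser.can_fetch("ExampleBot", "https://example.com/docs/intro"))   # True
print(parser.can_fetch("ExampleBot", "https://example.com/drafts/post"))  # False
```

Everything beyond that binary fetch decision, such as what a fetched page may be used for, falls outside the protocol.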
llms.txt points toward curated context
The llms.txt convention reflects a practical need: site owners want to tell language models where the most useful context lives. That can improve answers and reduce noisy crawling, especially for documentation-heavy sites.
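The proposal at llmstxt.org specifies the file as markdown: an H1 with the site or project name, a blockquote summary, and H2 sections listing links with one-line descriptions, plus an Optional section a model may skip when context is tight. The sketch below follows that shape; every name and URL is invented.

```markdown
# Example Docs

> Developer documentation for the Example API.

## Docs

- [Quickstart](https://example.com/docs/quickstart.md): install and make a first request
- [API reference](https://example.com/docs/api.md): endpoints, parameters, errors

## Optional

- [Changelog](https://example.com/changelog.md): release history
```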
The missing layer is consent history
Permission signals become more useful when they capture intent (what a given use is for), content categories, who owns the policy, and how it has changed over time. A crawler that can see that training access to a section was revoked last quarter, and by whom, can act on that history rather than on a single snapshot. The future likely includes multiple files and APIs working together, with consent policies acting as the source of truth.
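No standard for such a consent policy exists yet, so the JSON below is purely illustrative: every field name (owner, purposes, history, and so on) is an assumption about how intent, categories, ownership, and change history might be recorded together.

```json
{
  "policy_version": "2025-01-15",
  "owner": "publisher@example.com",
  "rules": [
    {
      "path_prefix": "/docs/",
      "categories": ["documentation"],
      "purposes": {
        "search-indexing": "allow",
        "ai-inference": "allow",
        "ai-training": "deny"
      }
    }
  ],
  "history": [
    { "version": "2024-09-01", "change": "initial policy" },
    { "version": "2025-01-15", "change": "denied ai-training for /docs/" }
  ]
}
```

The point is not this exact schema but the division of labor: per-purpose decisions, ownership, and an auditable history live in one machine-readable place that simpler files like robots.txt and llms.txt can defer to.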