# rote™ by modiqo: full agent / crawler access.
# We want LLMs and AI search to discover, cite, and link to our content.
#
# IMPORTANT FOR AGENTS: .md URLs are intentionally not published because this
# host does not recognize the .md extension and serves it as
# application/octet-stream. Every machine-readable page is available as a .txt
# mirror instead. The llms.txt / llms-full.txt indexes only reference .txt URLs.
# We Disallow .md below so well-behaved crawlers do not waste budget on
# unsupported legacy paths. Each group pairs `Allow: /` with
# `Disallow: /*.md$`; under RFC 9309 the longest matching rule wins, so .md
# paths stay blocked while everything else remains crawlable.

# ── Search engines ─────────────────────────────────────────────
User-agent: Googlebot
Allow: /
Disallow: /*.md$

User-agent: Googlebot-Image
Allow: /
Disallow: /*.md$

User-agent: Bingbot
Allow: /
Disallow: /*.md$

User-agent: DuckDuckBot
Allow: /
Disallow: /*.md$

User-agent: Applebot
Allow: /
Disallow: /*.md$

User-agent: Slurp
Allow: /
Disallow: /*.md$

User-agent: YandexBot
Allow: /
Disallow: /*.md$

User-agent: Baiduspider
Allow: /
Disallow: /*.md$

# ── Social previews ────────────────────────────────────────────
User-agent: Twitterbot
Allow: /

User-agent: facebookexternalhit
Allow: /

User-agent: FacebookBot
Allow: /

User-agent: LinkedInBot
Allow: /

User-agent: Slackbot
Allow: /

User-agent: Discordbot
Allow: /

User-agent: TelegramBot
Allow: /

# ── AI / LLM crawlers (training + retrieval + answer engines) ──
User-agent: GPTBot
Allow: /
Disallow: /*.md$

User-agent: ChatGPT-User
Allow: /
Disallow: /*.md$

User-agent: OAI-SearchBot
Allow: /
Disallow: /*.md$

User-agent: ClaudeBot
Allow: /
Disallow: /*.md$

User-agent: Claude-Web
Allow: /
Disallow: /*.md$

User-agent: anthropic-ai
Allow: /
Disallow: /*.md$

User-agent: PerplexityBot
Allow: /
Disallow: /*.md$

User-agent: Perplexity-User
Allow: /
Disallow: /*.md$

User-agent: Google-Extended
Allow: /
Disallow: /*.md$

User-agent: Applebot-Extended
Allow: /
Disallow: /*.md$

User-agent: Amazonbot
Allow: /
Disallow: /*.md$

User-agent: Bytespider
Allow: /
Disallow: /*.md$

User-agent: CCBot
Allow: /
Disallow: /*.md$

User-agent: cohere-ai
Allow: /
Disallow: /*.md$

User-agent: Meta-ExternalAgent
Allow: /
Disallow: /*.md$

User-agent: Meta-ExternalFetcher
Allow: /
Disallow: /*.md$

User-agent: YouBot
Allow: /
Disallow: /*.md$

User-agent: Diffbot
Allow: /
Disallow: /*.md$

User-agent: DuckAssistBot
Allow: /
Disallow: /*.md$

User-agent: MistralAI-User
Allow: /
Disallow: /*.md$

# ── Default: allow everything except .md ───────────────────────
User-agent: *
Allow: /
Disallow: /*.md$

# Sitemap + LLM indexes
# `Sitemap:` is the de facto standard directive (defined by sitemaps.org rather
# than RFC 9309, and supported by all major crawlers).
# `LLMs:` and `LLM-Full:` are non-standard hints that point at the llms.txt /
# llms-full.txt indexes (llmstxt.org convention). Parsers that ignore unknown
# directives tolerate them, and AI-aware crawlers that look for them can use them.
Sitemap: https://www.modiqo.ai/sitemap.xml
LLMs: https://www.modiqo.ai/llms.txt
LLM-Full: https://www.modiqo.ai/llms-full.txt