
ai bots robots.txt auditor.

check which of 20+ ai bots your robots.txt actually blocks

> worked example

A brand CMO pastes Shopify's robots.txt into the auditor. The tool shows that Googlebot and Bingbot are explicitly allowed, but GPTBot, ClaudeBot, Bytespider, and Amazonbot are all unmentioned, meaning they crawl freely by default. Switching the policy toggle to 'block all AI training bots' instantly generates the Disallow: / blocks needed for all nine training crawlers missing from the file.
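The rule generation step can be sketched as follows. The bot list here is illustrative, assuming a set of nine training crawlers; the auditor's actual list may differ:

```python
# Illustrative set of AI training crawlers (assumed; the tool's real
# list of nine may differ).
TRAINING_BOTS = [
    "GPTBot", "ClaudeBot", "Bytespider", "Amazonbot", "CCBot",
    "Google-Extended", "FacebookBot", "anthropic-ai", "Omgilibot",
]

def blocking_rules(bots):
    """Emit one robots.txt group per bot, disallowing everything."""
    lines = []
    for bot in bots:
        lines.append(f"User-agent: {bot}")
        lines.append("Disallow: /")
        lines.append("")  # blank line separates groups
    return "\n".join(lines)

print(blocking_rules(TRAINING_BOTS))
```

Each bot gets its own group rather than a shared one, since some crawlers only honor a group that names them exactly.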

takeaway: most ecommerce robots.txt files were written before LLM crawlers existed, so the majority of AI training bots are unmentioned and therefore allowed.

> when operators reach for this

  • Shopify SEO leads who want to block training crawlers but keep LLM search bots (ChatGPT-User, PerplexityBot) allowed for AI-driven traffic.
  • Brand CMOs worried about content being scraped into LLM training sets without consent, using the auditor to generate blocking rules for all nine training crawlers.
  • Ecommerce data leads auditing a competitor's robots.txt to understand how they handle AI bot access.
  • Agencies running technical SEO audits and needing a quick, shareable snapshot of a client's AI bot posture.
  • Headless commerce developers verifying that a newly deployed robots.txt correctly reflects the brand's crawl access policy across all 18+ audited bots.
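The audit itself reduces to one question per bot: does the file name it at all? A minimal sketch, assuming an illustrative subset of the audited bots:

```python
# Minimal audit sketch: which bots does a robots.txt mention explicitly?
# Unmentioned bots fall through to the '*' group, or to full access if
# there is no '*' group at all.
def mentioned_agents(robots_txt):
    """Return the set of user-agent names that get an explicit group."""
    agents = set()
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if line.lower().startswith("user-agent:"):
            agents.add(line.split(":", 1)[1].strip())
    return agents

AUDITED_BOTS = ["Googlebot", "GPTBot", "ClaudeBot", "PerplexityBot"]  # illustrative subset

sample = """User-agent: Googlebot
Disallow: /checkout
"""

named = mentioned_agents(sample)
for bot in AUDITED_BOTS:
    status = "explicit group" if bot in named else "falls back to * / default allow"
    print(f"{bot}: {status}")
```

On this sample, only Googlebot has an explicit group; the three AI bots fall back to the default and crawl freely.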

> the calculation

  • longest-match precedence: the most specific Allow / Disallow path wins per URL, e.g. Allow: /products beats Disallow: / for product pages.
  • user-agent matching: an exact bot-name group overrides the * wildcard group; a GPTBot group takes full precedence over a * group, even if * would disallow.
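Both rules can be sketched in a few lines, following the Robots Exclusion Protocol's precedence model (longest matching path wins; Allow wins ties; an exact user-agent group replaces the * group entirely):

```python
def is_allowed(url_path, rules):
    """rules: iterable of (kind, path), kind in {'allow', 'disallow'}.
    Longest matching path wins; on a tie, Allow beats Disallow."""
    best = (-1, True)  # (match length, allowed); no match => allowed
    for kind, path in rules:
        if url_path.startswith(path):
            candidate = (len(path), kind == "allow")
            if candidate > best:  # longer match wins; True > False on ties
                best = candidate
    return best[1]

def select_group(bot, groups):
    """An exact bot-name group takes full precedence over the '*' group."""
    return groups.get(bot, groups.get("*", []))

# Longest match: Allow: /products (len 9) beats Disallow: / (len 1).
rules = [("disallow", "/"), ("allow", "/products")]
print(is_allowed("/products/shoe", rules))  # True
print(is_allowed("/cart", rules))           # False

# Group precedence: GPTBot's own group overrides '*', even though '*' disallows.
groups = {"*": [("disallow", "/")], "GPTBot": [("allow", "/")]}
print(is_allowed("/products", select_group("GPTBot", groups)))      # True
print(is_allowed("/products", select_group("Bytespider", groups)))  # False
```

This is a simplified sketch: real robots.txt matching also handles `*` wildcards and `$` anchors inside paths, which are omitted here.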

> related calculators: ai & llm visibility