Original research · June 2026

We Checked the robots.txt of 41 Top AI Tools: Who Blocks ChatGPT, Claude & Perplexity?

AI search is the new front door, and a single line in robots.txt decides whether ChatGPT, Claude and Perplexity can read and cite a site. So we read the live robots.txt of 41 leading AI and SaaS tools and scored each against the 10 biggest AI crawlers. The result is not what most people assume.

Key findings

  • 36 of 41 tools (88%) block no AI crawler at all. The "AI companies are walling off their data" narrative does not show up in the robots.txt of the tools themselves.
  • Only 5 tools block any AI bot. GPTBot and CCBot are tied as the most-blocked crawlers, each blocked by just 3 of 41 sites (7%).
  • Citation crawlers are almost never blocked. The bots that feed ChatGPT and Perplexity answers (OAI-SearchBot, PerplexityBot, ChatGPT-User) are blocked by at most one site each, so these tools stay visible in AI search.
  • The two real blockers split on strategy. Figma blocks six crawlers including the citation bots, so its pages cannot be cited by ChatGPT or Perplexity. Canva blocks four, but only training bots, keeping citation bots open, which is the correct way to do it.

At a glance: 88% block nothing (36 of 41 tools) · 7% block GPTBot, tied with CCBot as the most-blocked bot (3 of 41 tools) · 2% block a citation bot (1 of 41 tools) · 5 tools block anything (out of 41 checked)

The headline: almost nobody is blocking

The common story in 2026 is that every site is racing to lock AI crawlers out. The tools in our sample are not doing that. Of 41 leading AI and SaaS products we checked, 36 (88%) carry a robots.txt that blocks none of the 10 major AI bots we scored. As far as robots.txt is concerned, their marketing pages, docs, blogs and comparison content are all readable by GPTBot, ClaudeBot, PerplexityBot and the rest. For a marketing site that is arguably the right call: being readable is how you get cited, and being cited in an AI answer is the new top of the funnel.

Citation bots vs training bots: the distinction that matters

Not all AI crawlers are the same, and that is the part most robots.txt files get wrong. A citation crawler (OAI-SearchBot, ChatGPT-User, PerplexityBot, Perplexity-User, Claude-User) fetches a page so an AI answer engine can quote and link it. Blocking those bots removes you from AI search results, which is pure lost traffic. A training crawler (GPTBot, ClaudeBot, CCBot, Bytespider, Google-Extended) collects pages to train a model, and blocking those costs you nothing in traffic. In our sample the citation bots are blocked by at most one site each, while the two most-blocked bots overall, GPTBot and CCBot, are both training crawlers, at 7% each. So the few blocks that exist are mostly aimed, correctly, at training rather than citation.

| AI crawler | Type | Tools blocking it | Share | What blocking costs |
| --- | --- | --- | --- | --- |
| GPTBot | Training | 3 of 41 | 7% | blocking is free (no traffic cost) |
| CCBot (Common Crawl) | Training | 3 of 41 | 7% | blocking is free (no traffic cost) |
| Bytespider | Training | 2 of 41 | 5% | blocking is free (no traffic cost) |
| OAI-SearchBot | Citation | 1 of 41 | 2% | blocking loses AI-search traffic |
| ChatGPT-User | Citation | 1 of 41 | 2% | blocking loses AI-search traffic |
| ClaudeBot | Training | 1 of 41 | 2% | blocking is free (no traffic cost) |
| PerplexityBot | Citation | 1 of 41 | 2% | blocking loses AI-search traffic |
| Google-Extended | Training | 1 of 41 | 2% | blocking is free (no traffic cost) |
| Claude-User | Citation | 0 of 41 | 0% | blocking loses AI-search traffic |
| Perplexity-User | Citation | 0 of 41 | 0% | blocking loses AI-search traffic |
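
The counts and shares in the table can be re-derived from the per-tool block lists reported in this study. The snippet below is a minimal sketch of that arithmetic, not the script used for the study; the names `BLOCKS`, `share` and `blocked_by` are our own for illustration, and shares are assumed to be rounded to the nearest whole percent.

```python
from collections import Counter

TOTAL_TOOLS = 41

# Per-tool block lists as reported in this study.
# The other 36 tools block none of the 10 crawlers.
BLOCKS = {
    "Figma": ["GPTBot", "OAI-SearchBot", "ChatGPT-User",
              "PerplexityBot", "Google-Extended", "CCBot"],
    "Canva": ["GPTBot", "ClaudeBot", "CCBot", "Bytespider"],
    "Descript": ["Bytespider"],
    "Loom": ["GPTBot"],
    "Calendly": ["CCBot"],
}


def share(count, total=TOTAL_TOOLS):
    """Percentage of tools, rounded to the nearest whole percent."""
    return round(100 * count / total)


# How many tools block each crawler.
blocked_by = Counter(bot for bots in BLOCKS.values() for bot in bots)

for bot, count in blocked_by.most_common():
    print(f"{bot:16} {count} of {TOTAL_TOOLS}  {share(count)}%")

tools_blocking_nothing = TOTAL_TOOLS - len(BLOCKS)
print(f"block nothing: {tools_blocking_nothing} of {TOTAL_TOOLS}  "
      f"{share(tools_blocking_nothing)}%")
```

Crawlers that no tool blocks (Claude-User and Perplexity-User) never appear in the counter, which is why they sit at 0 of 41.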

The 5 tools that actually block something

Only 5 of 41 tools block any AI crawler, and they fall into two camps. Figma is the strictest and, for AI search, the most self-defeating: it blocks the citation bots too, so Figma's own pages cannot be surfaced or cited inside ChatGPT or Perplexity answers. Canva shows the smarter pattern, blocking training bots (GPTBot, ClaudeBot, CCBot, Bytespider) while leaving the citation bots open, so it denies free training data without giving up AI-search visibility. The other three each block a single training bot: Descript blocks Bytespider, Loom blocks GPTBot, and Calendly blocks CCBot.

| Tool | AI crawlers it blocks | Count |
| --- | --- | --- |
| Figma | GPTBot, OAI-SearchBot, ChatGPT-User, PerplexityBot, Google-Extended, CCBot (Common Crawl) | 6 of 10 |
| Canva | GPTBot, ClaudeBot, CCBot (Common Crawl), Bytespider | 4 of 10 |
| Descript | Bytespider | 1 of 10 |
| Loom | GPTBot | 1 of 10 |
| Calendly | CCBot (Common Crawl) | 1 of 10 |

Methodology. In June 2026 we fetched the live robots.txt at the root domain of 41 well-known AI and SaaS tools and parsed each for 10 major AI crawlers, using the standard rule that a bot's own user-agent group overrides the wildcard group. A tool is counted as "blocking" a bot when its applicable group disallows the whole site (Disallow: /) with no equally broad Allow. Path-level rules that only restrict sub-folders are not counted as a block. Three tools (Midjourney, Gamma, Fathom) returned no parseable plain-text robots.txt at fetch time and are excluded. robots.txt is a request, not an enforcement mechanism, and a site can change it any day, so this is a snapshot. Want the same check on your own site? Use our free AI Crawler Access Checker.
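
The blocking rule described in the methodology can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not the study's actual script: the helper names `parse_groups` and `blocks_bot` are hypothetical, user-agent matching is an exact case-insensitive comparison, and "equally broad Allow" is read narrowly as a literal `Allow: /` in the same group.

```python
def parse_groups(robots_txt):
    """Return {lowercased user-agent: [(directive, path), ...]}."""
    groups = {}
    agents = []        # user-agents of the group currently being read
    in_rules = False   # True once the current group has seen a rule
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if in_rules:                      # a new group starts here
                agents, in_rules = [], False
            agents.append(value.lower())
            groups.setdefault(value.lower(), [])
        elif field in ("allow", "disallow"):
            in_rules = True
            for agent in agents:
                groups[agent].append((field, value))
        # Other fields (Sitemap, Crawl-delay, ...) are ignored.
    return groups


def blocks_bot(robots_txt, bot):
    """True when the bot's applicable group disallows the whole site."""
    groups = parse_groups(robots_txt)
    rules = groups.get(bot.lower())
    if rules is None:                 # no own group: fall back to wildcard
        rules = groups.get("*", [])
    site_wide_disallow = ("disallow", "/") in rules
    site_wide_allow = ("allow", "/") in rules
    return site_wide_disallow and not site_wide_allow


# Made-up robots.txt for illustration only.
SAMPLE = """\
User-agent: *
Disallow: /admin/

User-agent: GPTBot
User-agent: CCBot
Disallow: /

User-agent: Bytespider
Disallow: /private/
"""

for bot in ("GPTBot", "CCBot", "Bytespider", "OAI-SearchBot"):
    print(bot, blocks_bot(SAMPLE, bot))
```

In this sample, GPTBot and CCBot count as blocked. Bytespider does not, because its rule is path-level only, and OAI-SearchBot falls back to the wildcard group, which restricts a single folder.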

Cite this study

This is free, original research under a CC BY 4.0 license. Use any figure on your own site with a link back, and copy the citation below.

AI Tools Insider (2026). AI Crawler Access in 41 Top AI Tools. Retrieved from https://aitoolsinsiderhq.com/ai-crawler-study.html

Want the raw data? The full dataset (CSV + JSON) and a tiny no-dependency script to check any site are open-source on GitHub: AI Crawler Block List 2026 (MIT / CC BY 4.0).

Check your own site in 30 seconds

Want to know which AI crawlers your site allows or blocks? Paste your robots.txt into our free AI Crawler & robots.txt Access Checker. It flags all 18 major AI bots, tags each as a citation or training crawler, and hands you a recommended robots.txt that keeps AI-search visibility while denying free training.

Serious about AI-search visibility across a real site? Semrush is the all-in-one suite we use to track rankings and technical SEO, and it now reports on AI Overviews too.

That Semrush link is an affiliate link; it costs you nothing extra and we only run it on tools we use ourselves.

Questions about AI crawlers and robots.txt

Do top AI tools block AI crawlers like GPTBot?

Mostly no. Of the 41 leading AI and SaaS tools we checked, 36 (88%) block no AI crawler at all. Only 5 block any AI bot, and the two most-blocked crawlers, GPTBot and CCBot, are each blocked by just 7% of sites.

What is the difference between a citation crawler and a training crawler?

A citation crawler (OAI-SearchBot, ChatGPT-User, PerplexityBot) fetches pages so an AI answer engine can cite them, so blocking it costs you AI-search traffic. A training crawler (GPTBot, ClaudeBot, CCBot) collects pages to train models, so blocking it is free. The smart setup blocks training bots and allows citation bots.
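
As a rough illustration of that setup, the sketch below builds a robots.txt that disallows the training crawlers named in this study and leaves everything else open, then checks the result with Python's standard `urllib.robotparser`. The function name `build_robots_txt` and the `https://example.com/pricing` URL are placeholders of our own, and the standard-library parser may not match every crawler's real behaviour, so treat this as a starting point rather than a guaranteed configuration.

```python
from urllib.robotparser import RobotFileParser

# Crawler lists as categorised in this study.
TRAINING_BOTS = ("GPTBot", "ClaudeBot", "CCBot", "Bytespider", "Google-Extended")
CITATION_BOTS = ("OAI-SearchBot", "ChatGPT-User", "PerplexityBot",
                 "Perplexity-User", "Claude-User")


def build_robots_txt(blocked=TRAINING_BOTS):
    """One group that disallows the blocked bots, then an open wildcard group."""
    lines = [f"User-agent: {bot}" for bot in blocked]
    lines += ["Disallow: /", "", "User-agent: *", "Allow: /"]
    return "\n".join(lines) + "\n"


robots_txt = build_robots_txt()
print(robots_txt)

# Check the generated file the way a well-behaved crawler would.
parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

url = "https://example.com/pricing"   # placeholder URL
for bot in TRAINING_BOTS + CITATION_BOTS:
    print(f"{bot:16} allowed: {parser.can_fetch(bot, url)}")
```

The loop reports each training crawler as not allowed and each citation crawler as allowed, because the citation bots have no group of their own and fall through to the open wildcard group.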

Which AI tools block the most crawlers?

Figma blocks the most (six, including the citation bots that feed ChatGPT and Perplexity, so its pages cannot be cited there). Canva blocks four, but only training bots, keeping citation bots open, which is the textbook-correct setup. Most other tools block nothing.

How can I check which AI crawlers my own site blocks?

Paste your robots.txt into our free AI Crawler and robots.txt Access Checker. It shows which of 18 major AI bots you allow or block, tags each as citation or training, and generates a recommended robots.txt.

More on AI search visibility

See whether ChatGPT and Perplexity can actually read your pages, or browse the free tools we built for AI-search optimization.
