How AI crawlers follow redirects (what the research shows in 2026)

The redirect rules that work for Google don't apply to ChatGPT, Claude or Perplexity. Here's what the data actually shows.

SEO · By Matt Hayles · 2026-04-02 · 22 min read
[Image: Close-up of excavator tracks on a construction site with the text "Redirects vs. AI Crawlers" overlaid in bold green and purple typography.]

AI crawlers follow HTTP redirects with a 301 or 302 status code, but can’t follow JavaScript or meta refresh redirects. Research shows that redirects that aren’t served with a standard 301/302 status code are effectively dead ends for the various crawlers deployed by AI companies. Since the crawlers for these AI companies don't execute JavaScript or other client-side functions, those kinds of redirects are invisible to them.
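The difference can be seen in a minimal sketch. Assuming a hypothetical non-rendering crawler and made-up URLs, a server-side 301 exposes its destination in the HTTP response itself, while a JavaScript redirect hides it inside script code the crawler never executes:

```python
# Simulated responses keyed by URL: (status code, headers, body).
# A server-side 301 announces its destination in the Location header,
# visible to even the simplest HTTP client. A JavaScript redirect
# returns 200 and buries the destination in script code that only a
# rendering browser would run. All URLs here are illustrative.
RESPONSES = {
    "https://example.com/old": (301, {"Location": "https://example.com/new"}, ""),
    "https://example.com/js-old": (200, {}, "<script>location.href='/new'</script>"),
    "https://example.com/new": (200, {}, "<h1>Destination</h1>"),
}

def fetch_like_an_ai_crawler(url):
    """Follow only HTTP-level redirects, as a non-rendering crawler would."""
    status, headers, body = RESPONSES[url]
    if status in (301, 302) and "Location" in headers:
        return fetch_like_an_ai_crawler(headers["Location"])
    return url, body  # a 200 page is taken at face value; scripts are ignored
```

Run against this map, the 301 resolves to the new page, while the JavaScript redirect leaves the crawler stuck on the old URL with an empty shell of a page.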

The companies behind Large Language Models (LLMs) deploy a variety of different crawler bots to ingest content for their models and to answer user questions. This includes OpenAI (GPTBot, OAI-SearchBot, ChatGPT-User), Anthropic (ClaudeBot, Claude-SearchBot, Claude-User), Meta (Meta-ExternalAgent) and Perplexity (PerplexityBot). Every major LLM crawler treats your redirect infrastructure the same way a very fast, very literal HTTP client would.

Google's Gemini is the exception because it uses Googlebot's full rendering pipeline, which reads and runs JavaScript and has other advanced features. But AI crawlers are newer, with less mature infrastructure and more resource constraints. They crawl less patiently and less completely, and recrawl even less frequently.

Here’s everything site owners and SEOs need to know about redirects, 404s and crawl frequency in the age of AI search.

The AI crawler landscape: who's crawling your site and why

AI crawlers generally serve one of three purposes: collecting training data to improve the underlying language model, building a search index to power real-time citations and answers, and fetching specific pages on demand when a user triggers a live retrieval. The major platforms deploy separate bots for each purpose, though not all platforms cover all three. In general, AI crawlers run newer infrastructure with fewer resources and less reliable or complete crawl behaviour than Googlebot.

Google sits in a different category entirely. Rather than building new crawl infrastructure from scratch, Google's AI products run on top of Googlebot, the same crawler that has powered Google Search for decades. This means Google’s AI Overviews, AI Mode and Gemini all inherit Googlebot's full rendering pipeline, its adaptive crawl scheduling, its redirect-handling sophistication, and its 25-year head start in crawl optimization. It is the only AI system that executes JavaScript, follows meta refresh redirects and maintains a continuously updated index.

| Platform | Training | Search indexing | User-triggered retrieval |
|---|---|---|---|
| OpenAI | GPTBot | OAI-SearchBot | ChatGPT-User |
| Anthropic | ClaudeBot | Claude-SearchBot | Claude-User |
| Perplexity | | PerplexityBot | Perplexity-User |
| Meta | Meta-ExternalAgent | | |
| Google | Googlebot | Googlebot | Google-Agent |

For SEOs and web managers, understanding which type of bot is visiting your site matters because each has different crawl frequency, different implications for your content's visibility and can be controlled independently by robots.txt.
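As a sketch of that independent control, Python's standard `urllib.robotparser` can show how a robots.txt file treats different bot tokens. The file contents and URLs below are illustrative, blocking the training crawlers while leaving everything else open:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: disallow the training bots (GPTBot, ClaudeBot)
# while all other user agents, including the search-indexing bots, stay allowed.
robots_txt = """
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: *
Allow: /
""".strip().splitlines()

parser = RobotFileParser()
parser.parse(robots_txt)

print(parser.can_fetch("GPTBot", "https://example.com/article"))        # False
print(parser.can_fetch("OAI-SearchBot", "https://example.com/article")) # True
```

Because each platform publishes distinct user agent tokens per purpose, you can (for example) opt out of training while remaining eligible for search citations.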

More importantly, the redirect and URL infrastructure that works reliably for Googlebot cannot be assumed to work the same way for GPTBot, ClaudeBot or PerplexityBot. These are younger, less sophisticated systems that crawl less frequently, handle errors less gracefully and have no equivalent to Google Search Console for diagnosing problems. Understanding which bots are visiting your site, and for what purpose, is the starting point for any serious AI visibility strategy.

None of the major LLM crawlers follow JavaScript redirects (except Googlebot)

Vercel analyzed 1.3 billion fetches by GPTBot, ClaudeBot, PerplexityBot and others. For perspective, these bots combined accounted for a little over 28% of Googlebot's volume.

The data indicates that while ChatGPT and Claude crawlers do fetch JavaScript files (ChatGPT: 11.50%, Claude: 23.84% of requests), they don't execute them. They can't read client-side rendered content.

Because these AI crawlers never even attempt JavaScript rendering, they simply do not see JS redirects. The source page remains indexed, even if it has no content, and the intended destination is ignored.

Google is in a fundamentally different position from every other LLM crawler because it reuses its existing search infrastructure.

Google's Gemini leverages Googlebot's infrastructure, enabling full JS rendering. But Google's documentation still strongly prefers server-side redirects (301 or 302) over JavaScript redirects. If JavaScript generates the redirect, Google must first render the page and execute the script before it can discover and follow the redirect. It works, but it takes longer and can use more crawl budget than necessary.

Major LLM crawlers don’t follow meta refresh redirects (again, except Googlebot)

There is less empirical data specifically testing how LLM crawlers handle meta refresh redirects. But the existing body of evidence strongly suggests AI crawlers do not follow meta refresh redirects.

A meta refresh redirect lives in the HTML <head>. Despite looking like an HTML element, it is functionally a client-side directive. The browser or user agent must parse the HTML, read the meta tag, interpret the relevant attribute, extract the destination URL and then initiate a new HTTP request. It is not an HTTP-level redirect.

Processing a meta refresh requires the crawler to act on the HTML it downloads. That's a different capability from parsing HTML text. Most lightweight HTTP crawlers (which is what LLM crawlers appear to be, based on the no-JavaScript-execution finding) don't implement meta refresh logic because it's browser behavior, not HTTP-layer behavior.
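To illustrate that extra work, here is a sketch of the parsing step a crawler would need to implement, using Python's standard `html.parser`. The HTML snippet is hypothetical; the point is that none of this logic comes for free in a plain HTTP client:

```python
from html.parser import HTMLParser

class MetaRefreshFinder(HTMLParser):
    """Extract the destination URL from a <meta http-equiv="refresh"> tag.
    This is exactly the extra, browser-like step a lightweight HTTP
    crawler would have to implement, and which most AI crawlers skip."""
    def __init__(self):
        super().__init__()
        self.destination = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and attrs.get("http-equiv", "").lower() == "refresh":
            content = attrs.get("content", "")  # e.g. "0; url=/new-page"
            _, _, url_part = content.partition("url=")
            self.destination = url_part.strip() or None

html = '<html><head><meta http-equiv="refresh" content="0; url=https://example.com/new"></head></html>'
finder = MetaRefreshFinder()
finder.feed(html)
print(finder.destination)  # https://example.com/new
```

A crawler that merely stores the downloaded HTML, rather than acting on it, never initiates the second request that the tag implies.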

While no large-scale research study has investigated this directly, case studies confirm the behavior. Benson SEO’s analysis of 14 days of server logs found that AI bots do not execute any front-end scripts the way a standard browser or Google's rendering engine does. AI crawlers prioritize full-text retrieval over traditional web rendering.

By contrast, Google does follow meta refresh redirects and interprets some of them as permanent redirects. Google's Gemini inherits Googlebot's full browser-based rendering pipeline, so it can handle meta refresh redirects. No other LLM crawler has this infrastructure.

Major AI crawlers, such as ChatGPT and Claude, frequently attempt to fetch outdated or non-existent assets, indicating a significant need for improved URL selection and handling strategies.

Vercel's analysis highlights this issue:

  • ChatGPT spends 14.36% of its fetches following redirects and 34.82% hitting 404 pages.
  • Claude similarly spends 34.16% of its fetches on 404 pages.

These rates of redirects and 404s are comparatively high when contrasted with more mature infrastructure like Googlebot, which has spent more time optimizing its crawler to target valid resources. Googlebot's fetch rates are significantly lower: 8.22% of fetches are for 404s and 1.49% hit redirects.

Not only do AI crawlers tend to crawl more 404s than Google, but AI models also cite more 404 pages than Google. The team at Ahrefs analyzed the HTTP status of 16 million unique URLs cited by ChatGPT, Perplexity, Copilot, Gemini, Claude and Mistral.

Their research found that AI models send visitors to 404 pages 2.87x more often than Google Search. ChatGPT was the worst performer, with 1.01% of clicked URLs and 2.38% of all cited URLs returning a 404 status (compared to baseline 404 rates of 0.15% and 0.84%, respectively, within Google Search).

Additionally, observed server log data shows AI crawlers have historically underused sitemaps, and with significant inconsistency. In one case study, GPTBot fetched a sitemap every month on one site and skipped it entirely on another, while ClaudeBot hit the same sitemap multiple times. There was no predictable pattern for either bot.

A more recent case study found that both ClaudeBot and GPTBot started requesting sitemap.xml for the first time on the same day, from different companies. This suggests these platforms may be taking a new approach to content discovery, one that uses and respects sitemap.xml files.

What’s happening here? Likely a combination of factors. AI models are known to hallucinate from time to time, and this extends to cited URLs as well. Additionally, AI crawlers don’t crawl as often as Google does, which means their picture of your content becomes stale and outdated more quickly. And if they have historically ignored sitemap.xml, that would prevent them from discovering new content and removing old, outdated pages.

All in all, the data suggests that major AI crawlers don’t maintain an up-to-date index of your website's content. Even when they aren’t hallucinating, neither 404s nor 301s are working as effective signals for AI crawlers to update their content indexes. Keeping proper 301 redirects in place for longer is likely required, since AI models will continue to serve older content for longer.

There is no publicly documented hop limit for any non-Google LLM crawler

None of OpenAI's, Anthropic's or Perplexity's official documentation specifies a redirect chain limit. And there’s no documented third-party research. This simply hasn't been systematically tested and published the way Googlebot's behavior has been.

We can infer that the same general principles apply to AI crawlers as to Googlebot: search engines follow redirect chains but impose limits. Crawlers must track visited URLs in each chain to detect and break redirect loops. Without loop detection, a crawler requesting URL A that redirects to B, which redirects back to A, would run indefinitely. Any competent crawler implementation includes loop detection, but the specific hop ceiling for LLM crawlers is simply undocumented publicly.
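A minimal sketch of those two safeguards, loop detection and a hop ceiling, might look like the following. The redirect map is hypothetical, and the 10-hop default is borrowed from Google's documented ceiling, not from any documented LLM crawler limit:

```python
# Hypothetical redirect map: /a <-> /b forms a loop; /c -> /d resolves cleanly.
REDIRECTS = {
    "https://example.com/a": "https://example.com/b",
    "https://example.com/b": "https://example.com/a",
    "https://example.com/c": "https://example.com/d",
}

def resolve(url, max_hops=10):
    """Follow redirects with loop detection and a hop ceiling,
    the two safeguards any competent crawler implementation needs.
    Returns the final URL, or None if the chain is abandoned."""
    visited = set()
    for _ in range(max_hops):
        if url in visited:
            return None          # loop detected: abandon the chain
        visited.add(url)
        if url not in REDIRECTS:
            return url           # final destination reached
        url = REDIRECTS[url]
    return None                  # hop ceiling exceeded

print(resolve("https://example.com/c"))  # https://example.com/d
print(resolve("https://example.com/a"))  # None (loop)
```

The same structure also explains why long chains waste crawl budget: every hop is a full round trip before any content is fetched.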

Google is uniquely transparent here. More is known about Google’s crawl infrastructure because of its maturity in the market. Search engine leaders have been running tests on Google’s infrastructure for decades. And Google has a long and public record of statements about how Googlebot works.

Google's John Mueller has advised site owners to aim for fewer than 5 hops in a 301 redirect chain. However, Googlebot’s documentation cites a higher ceiling, saying Google can follow up to around 10 redirect hops. Regardless of the specific ceiling, on any large site, redirect chains add delay and can prevent some pages from being crawled and indexed effectively.

Because the major AI companies have less mature crawl infrastructure, we can infer that AI crawlers have much stricter processing limitations than Googlebot. At urllo we advise our customers that a redirect chain with even just two hops can add enough latency to significantly raise the chance of the crawler abandoning the request before it ever reaches your content.

We recommend avoiding redirect chains for this reason and going straight to the final URL. There are some circumstances where a redirect chain can make sense, for example using redirects to perform HTTPS upgrading or apex to www redirects. But there are almost no circumstances where more than two or three hops are required.

To test your redirects for chains and slow response times, use a free redirect checker tool.

How do AI crawlers treat 301, 302, 307 and 308 status codes?

The short answer is: we don’t know. There’s essentially no public data on how LLM crawlers treat different redirect status codes.

For LLM crawlers that are fundamentally doing GET requests to fetch HTML content, 307 and 308 redirects likely make no difference. They're not submitting forms or API calls. The distinction matters more in agentic use cases where an agent might POST to something and hit a redirect.
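The method-preservation distinction can be summarized in a small sketch. The 301/302 rewriting shown here reflects common client behavior, not any documented LLM crawler behavior:

```python
def method_after_redirect(status, method):
    """Sketch of how clients commonly treat the redirect family:
    307/308 preserve the request method, while 301/302 are widely
    rewritten to GET for historical reasons (303 always becomes GET)."""
    if status in (307, 308):
        return method            # method preserved by definition
    if status == 303:
        return "GET"
    if status in (301, 302):
        return "GET" if method == "POST" else method  # common client behavior
    raise ValueError(f"not a redirect status: {status}")

print(method_after_redirect(301, "GET"))   # GET  -> no difference for crawlers
print(method_after_redirect(302, "POST"))  # GET  -> method rewritten
print(method_after_redirect(308, "POST"))  # POST -> method preserved
```

For a crawler that only ever issues GET requests, every branch above returns GET, which is why the 307/308 distinction likely has no practical effect on AI crawling.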

One word of caution: 307 and 308 are relatively new and therefore not supported by all browsers and crawlers. It’s possible that 307 and 308 redirects do interfere with AI crawlers; we simply don’t know one way or the other. Because of this limited support, we at urllo always recommend using 301 and 302 redirects unless you need method preservation for POST requests. This ensures your users and crawlers are redirected properly and that all clients understand the redirect response code.

How often do AI crawlers actually crawl your website?

Vercel found that Googlebot is ~3.5× larger by crawl volume than all major LLM crawlers combined. For perspective, GPTBot, Claude, AppleBot and PerplexityBot combined account for nearly 1.3 billion fetches, a little more than 28% of Googlebot's volume. More recent research from Cloudflare found that Googlebot, GPTBot, ClaudeBot and Meta-ExternalAgent all continue to increase their share of crawl traffic.

But that's across the entire Internet. At the individual site level, what a site owner actually experiences is different. And again, Googlebot behaves very differently from the major AI crawlers.

Google has the largest and most mature crawl infrastructure on Earth, and performs a continuous crawl of the Internet. Google's system is highly adaptive and site-dependent, and doesn’t treat all websites and content the same. Major news outlets might see Googlebot visiting multiple times per day, whilst smaller or less frequently updated websites could wait weeks between crawls. Most established websites with regular content updates can expect Google to crawl their pages every few days to a week.

Unlike Googlebot’s steady incremental visits, AI crawling occurs in sporadic, large bursts. One study of 48 days of server logs found that GPTBot was largely absent, then executed 152 requests in a 3-minute burst, then more or less disappeared again.

Why do AI crawlers crawl in bursts? AI crawlers are trying to accomplish a very different goal than Googlebot. Cloudflare found that 80% of AI crawling was for training models, which is expensive and infrequent. Training crawlers like GPTBot and ClaudeBot don't need to recrawl pages continuously because they're building a static dataset for a future model version. They sweep a site comprehensively once (or once per retraining cycle), then leave. Search/retrieval crawlers (OAI-SearchBot, PerplexityBot, ChatGPT-User) recrawl more frequently but are triggered by user queries, not internal schedules.

What does this mean? Your website has fewer chances to be indexed by AI crawlers like ChatGPT, Claude or Perplexity. If your goal is to be visible and mentioned in AI, it’s critical to maintain proper server-side redirects with few redirect chains.

Key takeaways for web managers and SEOs

Use server-side redirects. ChatGPT and Claude don't follow client-side redirects (such as JavaScript or meta refresh redirects). This means any content reachable only via a JavaScript or meta refresh redirect is effectively invisible to them.

Because AI crawlers don’t crawl continuously, AI models have a “stale” index of content and will continue to crawl and serve old, outdated URLs on your website. This means keeping proper 301 redirects in place for longer is more important.

AI crawlers are less patient than Googlebot. They abandon pages more quickly, and retry less frequently. That means the bar is higher to drive visibility in AI mentions as opposed to traditional Google search results.

Lastly, maintaining efficient URL management is crucial. The high 404 error rates seen with AI crawlers show the importance of maintaining redirects, sitemaps and consistent URL patterns throughout your websites and domains.

Frequently asked questions about redirects and AI crawlers

Do AI crawlers follow JavaScript redirects?

No. Major AI crawlers like GPTBot (ChatGPT), ClaudeBot (Claude) and PerplexityBot do not execute JavaScript, which means they cannot follow JavaScript redirects. The source page remains indexed and the destination is ignored entirely. Google's Gemini is the only exception, since it inherits Googlebot's full rendering pipeline. Even so, Google's own documentation recommends server-side redirects over JavaScript redirects because they're faster and consume less crawl budget.

Do AI crawlers follow meta refresh redirects?

No. Although meta refresh tags are written in HTML, they require browser-like behavior to act on: parsing the tag, extracting the destination URL and initiating a new request. AI crawlers behave like lightweight HTTP clients and don't implement this logic. Google is again the exception.

Which redirect status codes should I use for AI crawlers?

Stick to 301 and 302. There is no published data on how LLM crawlers treat 307 or 308 status codes, and since these codes are newer, support is not guaranteed across all crawler implementations. For standard page redirects (involving GET requests) the 301/302 distinction matters most. Use 301 for permanent moves and 302 for temporary ones.

Do LLM crawlers follow 301 vs. 302 differently at the HTTP level?

There is no controlled, published research testing this directly. No AI company has documented how their crawlers treat different redirect status codes, and no third party has run systematic experiments comparing 301 and 302 handling across GPTBot, ClaudeBot or PerplexityBot. Both redirect types are almost certainly followed since that’s what crawlers do, but whether the semantic distinction between permanent and temporary redirects influences AI crawl behavior downstream is simply unknown. This is a meaningful gap in the public research, particularly given how well-documented Google's behavior on this question is.

Do 307 and 308 redirects matter for LLM crawlers?

Almost certainly not. The purpose of both codes is method preservation, that is, ensuring a POST request stays a POST through the redirect. For LLM crawlers performing simple GET requests to fetch HTML, there is no method to preserve. So 307 behaves identically to 302, and 308 behaves identically to 301.

The one caveat is that both are newer status codes with no published confirmation that GPTBot, ClaudeBot, PerplexityBot or others handle them correctly. Given that uncertainty, and the fact that method preservation offers no benefit in standard crawling contexts, 301 and 302 remain the safer and more universally supported choice.

How many redirect hops will AI crawlers follow?

Unknown. None of OpenAI, Anthropic or Perplexity has published a documented hop limit, and no third-party research has tested this systematically. Google generally advises site owners to keep redirect chains shorter than 5-10 hops. Because AI crawlers are generally operating with more crawl constraints than Google, it’s safe to assume AI crawlers follow fewer hops. You can use a redirect checker to test your redirects for chains and slow response time.

Why do AI crawlers hit so many 404 pages?

AI crawlers have less mature URL selection logic, crawl less frequently and have historically underused sitemaps. This means their indexes go stale and they continue requesting URLs that no longer exist.

How often do AI crawlers visit my site?

Much less often than Googlebot, and in a very different pattern. Googlebot performs a continuous, adaptive crawl, visiting major sites multiple times per day and most established sites every few days to a week. AI training crawlers like GPTBot and ClaudeBot crawl in sporadic bursts. This is because roughly 80% of AI crawling is for model training, which happens infrequently. The practical consequence is that AI models maintain a staler index of your site than Google does.

Should I keep 301 redirects in place longer than I would for Google?

Yes. Because AI crawlers visit less frequently and have historically ignored sitemaps, they continue serving older, cached versions of your content well after you've made changes. A 301 redirect that you might safely retire after a few months for Google's benefit should be kept active longer to account for AI crawlers that may not have re-indexed the destination yet.

Do AI crawlers use my sitemap.xml?

Historically, only inconsistently and rarely. However, more recent data shows both GPTBot and ClaudeBot may be starting to request sitemap.xml regularly. This suggests these platforms are starting to take a more structured approach to content discovery. Keeping your sitemap current and accurate is increasingly important.


By Matt Hayles

VP, Revenue

Matt brings over 15 years of experience in building brands, acquiring customers and scaling revenue for growth-stage companies. An avid runner, hiker, skier and backpacker, he likes to explore the outdoors with his family.
