Why Two Tools Report Different AI Visibility Numbers (API vs Scraping, Explained)

Run two AI visibility tools on the same store and you can get two different numbers. The reason is method. Here is how API, scraping, and manual checks differ, and how to know which number to trust.

Key takeaways

  • Two AI visibility tools can report different numbers for the same store because they measure differently.

  • Scraping usually captures the logged-out experience, which is not what most real shoppers see.

  • APIs are more consistent and repeatable, but must be configured to mirror real user behavior.

  • Personalization and sampling both move the number, so method and transparency matter more than the headline figure.

Run two AI visibility tools on the same store, on the same day, and you can get two different numbers. One says your share of voice is forty percent, the other says twenty. Neither vendor is lying. They are measuring different things, in different ways, and the method quietly decides the result. If you are going to make decisions off an AI visibility number, you need to understand what is behind it. This guide explains the three ways AI visibility gets measured, why they disagree, and how to tell which number to trust.

The three ways AI visibility gets measured

Every tool, and every manual effort, uses one of three approaches. Each has real strengths and real blind spots.

1. Manual spot-checks

You open ChatGPT, ask a question, and note whether your brand appears. It is free and intuitive, and it is how most teams start. The problem is that it does not scale and it does not repeat. AI answers vary from one run to the next, so a single manual check tells you what happened once, not how often you appear. It is a useful gut check and a poor measurement.

2. Interface scraping

Many tools work by automating the public chatbot interface and capturing the answer it displays. This scales better than manual checks, but it carries a blind spot most buyers never hear about. Scraping almost always captures the logged-out experience, and logged-out sessions often run on different, sometimes older, model behavior than the logged-in experience most shoppers actually use. So a scraped number can over-represent a narrow slice of reality and miss how the tool behaves for a typical signed-in buyer.

3. Model APIs

Other tools query the model through its API. This is repeatable and testable: you can run the same prompts the same way and trust the comparison over time. The catch is that a raw API call is not automatically the same as the consumer interface, which adds its own instructions, memory, and context. An API approach is only as good as how carefully it is configured to mirror what real users experience. Done well, it is the most consistent method. Done carelessly, it measures something no shopper ever sees.

The hidden variable most numbers ignore: personalization

Here is the part that surprises people. The same question, asked by two different shoppers, can return different brands. AI answers are shaped by whether the user is logged in, their location, their language, their past conversations, and even the time of day. Traditional search has a small personalization effect; in AI interfaces it can be much larger.

This is why a single scraped, logged-out number is incomplete. It describes one anonymous user in one location, not the range of real buyers you care about. A credible measurement either models several realistic user segments or is honest that it represents one specific viewpoint. Beware any tool that reports a single confident number with no mention of whose experience it reflects.

The other hidden variable: sampling

Even with a perfect method, one reading is not the truth. Because AI answers vary run to run, your real visibility is a probability you estimate by sampling, not a fact you read once. Ask a question forty times and your brand might appear in twelve answers; ask it again tomorrow and you might get fourteen. Without enough checks, the difference between two tools, or between this week and last, can be pure noise. We cover exactly how many checks you need before a number is trustworthy in our guide to measuring AI visibility with confidence.
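To make the noise concrete, here is a minimal sketch that puts an approximate 95% range around the two readings above (12 of 40 and 14 of 40). It uses the Wilson score interval, which is one common choice for a proportion and is our assumption here, not a method any particular tool is known to use. The function name `wilson_interval` is illustrative.

```python
import math


def wilson_interval(hits: int, runs: int, z: float = 1.96) -> tuple[float, float]:
    """Approximate Wilson score interval for an appearance rate.

    hits: number of answers that named the brand
    runs: total number of answers sampled
    z:    1.96 corresponds to roughly 95% confidence
    """
    if runs <= 0:
        raise ValueError("runs must be positive")
    p = hits / runs
    denom = 1 + z**2 / runs
    center = (p + z**2 / (2 * runs)) / denom
    half = z * math.sqrt(p * (1 - p) / runs + z**2 / (4 * runs**2)) / denom
    return center - half, center + half


today = wilson_interval(12, 40)
tomorrow = wilson_interval(14, 40)
print(f"12/40: {today[0]:.0%} to {today[1]:.0%}")
print(f"14/40: {tomorrow[0]:.0%} to {tomorrow[1]:.0%}")

# If the two ranges overlap, the day-to-day change may be nothing but noise.
print("ranges overlap:", today[1] >= tomorrow[0] and tomorrow[1] >= today[0])
```

With forty runs, each range is close to thirty percentage points wide (roughly 18% to 45% and 22% to 50%), and the two overlap heavily, so the move from twelve to fourteen is well within what chance alone would produce.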

So which number should you trust?

Trust the number whose method is transparent and matches the decision you are making. There is no single correct figure, because AI visibility genuinely differs by user, engine, and moment. What separates a reliable measurement from a misleading one is not the headline percentage. It is whether the method is clear, repeatable, broad enough to be statistically sound, and honest about what slice of reality it represents.

This is the same reason BrandOcto leans on open, inspectable signals rather than a black box. A number you cannot question is a number you cannot act on with confidence.

How to measure AI visibility credibly

Whether you build it yourself or buy a tool, the principles are the same.

  • Use a real prompt set, not one query. Aim for dozens of prompts across branded, category, comparison, and use-case questions.

  • Cover multiple engines. ChatGPT, Gemini, Perplexity, and Google AI Overviews name different brands, so one engine is not the story.

  • Sample enough to beat noise. Run prompts repeatedly on a steady cadence and watch the trend, not a single reading.

  • Measure per funnel stage. A blended number hides where you actually lose, which is what AI funnel tracking exists to show.

  • Demand transparency. Know whether the data is from scraping or API, logged in or out, and which user it represents.
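The principles above can be sketched as a small aggregation over logged answers. This is a hedged illustration, not how any particular tool works: the record layout, the brand names "Acme" and "Rival", and the `appearance_rate` helper are all hypothetical, and "appearance rate" here simply means the fraction of sampled answers that named the brand.

```python
from collections import defaultdict

# Hypothetical log: one record per answer collected.
# "brands" is the set of brand names that appeared in that answer.
records = [
    {"engine": "ChatGPT", "category": "category", "brands": {"Acme", "Rival"}},
    {"engine": "ChatGPT", "category": "category", "brands": {"Rival"}},
    {"engine": "ChatGPT", "category": "comparison", "brands": {"Acme"}},
    {"engine": "Gemini", "category": "category", "brands": {"Rival"}},
    {"engine": "Gemini", "category": "comparison", "brands": {"Rival"}},
    {"engine": "Gemini", "category": "comparison", "brands": {"Acme", "Rival"}},
]


def appearance_rate(records, brand, keys=("engine",)):
    """Return {group: (hits, runs, rate)} for one brand.

    keys selects the fields to break the number down by, so the same
    function gives a per-engine view or a per-engine-and-category view.
    """
    counts = defaultdict(lambda: [0, 0])  # group -> [hits, runs]
    for record in records:
        group = tuple(record[k] for k in keys)
        counts[group][1] += 1
        if brand in record["brands"]:
            counts[group][0] += 1
    return {g: (hits, runs, hits / runs) for g, (hits, runs) in counts.items()}


by_engine = appearance_rate(records, "Acme")
by_engine_and_category = appearance_rate(records, "Acme", keys=("engine", "category"))

for group, (hits, runs, rate) in by_engine.items():
    print(group, f"{hits}/{runs} = {rate:.0%}")
```

Keeping the raw hits and runs next to each rate matters: a blended 50% can hide one engine at two in three and another at one in three, and a rate built on three runs is far too thin to act on. Adding a field such as funnel stage or user segment to each record would let the same function report those breakdowns too.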

Questions to ask before you trust any AI visibility number

Whether the number comes from a tool or your own team, ask these before you act on it:

  • Is this from interface scraping or a model API, and why that choice?

  • Does it reflect a logged-in or logged-out experience?

  • How many prompts and how many runs is the number based on?

  • Which engines are included, and are they reported separately?

  • Is it one blended score, or broken down by funnel stage and competitor?

  • Can I see the underlying prompts and sources, or is it a black box?

If a vendor cannot answer these, the number is decoration, not measurement. Once you trust the number, turn it into action by ranking fixes with the Action Layer, and read the metric itself in our guide to AI share of voice.
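If you track several tools or reports, one lightweight way to apply the checklist is to record each answer next to the number and flag whatever is still missing. The sketch below is only an illustration; the `MethodDisclosure` class and its field names are hypothetical and simply mirror the questions above.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class MethodDisclosure:
    """What a vendor or internal report has disclosed about its method.

    None means the question has not been answered yet.
    """
    collection: Optional[str] = None            # "scraping" or "api"
    login_state: Optional[str] = None           # "logged_in" or "logged_out"
    prompt_count: Optional[int] = None          # how many distinct prompts
    runs_per_prompt: Optional[int] = None       # how many repeats of each prompt
    engines: Optional[list[str]] = None         # which engines are included
    per_stage_breakdown: Optional[bool] = None  # split by funnel stage and competitor?
    prompts_inspectable: Optional[bool] = None  # can you see prompts and sources?


def unanswered(disclosure: MethodDisclosure) -> list[str]:
    """List the checklist items that are still undisclosed."""
    return [f.name for f in fields(disclosure) if getattr(disclosure, f.name) is None]


vendor = MethodDisclosure(collection="api", engines=["ChatGPT", "Gemini"])
print("Still unanswered:", unanswered(vendor))
```

An empty list does not make the number right; it only means you know enough about the method to question it.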

Who needs to care about this

If you are choosing an AI visibility tool, or reporting AI numbers to a client or a boss, the method behind the number is your job to understand, because you will be held to it. If you are just checking whether you show up at all, start simpler with an AI reachability audit, then graduate to structured measurement once there is something worth tracking precisely.

Frequently asked questions

Why do two AI visibility tools give different numbers?

Because they measure differently. One may scrape the logged-out chatbot interface while another queries the model API, and they may use different prompts, engines, and sample sizes. Each choice changes the result, so the numbers are not directly comparable unless the methods match.

Is API or scraping better for measuring AI visibility?

Neither is automatically better. Scraping captures what the public interface shows but usually only the logged-out view. APIs are more consistent and repeatable but must be configured to mirror real user behavior. The better tool is the one whose method is transparent and fits the decision you are making.

Does personalization really change AI answers that much?

Yes. Login status, location, language, past conversations, and timing can all change which brands an AI names. That is why a single logged-out number is incomplete, and why good measurement models several realistic user segments or states clearly which one it represents.

How many checks do I need for a trustworthy number?

More than one. Because answers vary run to run, you need to sample across many prompts and repeat them over time. Our guide to measuring AI visibility with confidence walks through the specific sample sizes for different levels of precision.

What is the single most important thing to check in a tool?

Transparency. A tool that tells you its method, its prompt set, its engines, and the user it represents lets you trust and question the number. A black-box score that cannot explain itself cannot be acted on with confidence.

Related guides

Go deeper with how much data you need to measure AI visibility with confidence, learn the core metric in AI share of voice, and score your store with the AI search readiness checklist.

About the author

Chirantan Mungara writes about AI search visibility and generative engine optimization for ecommerce teams at BrandOcto, focused on how AI engines like ChatGPT choose and recommend products.

Measure your AI visibility the honest way

BrandOcto runs structured, multi-prompt, multi-model checks so your Share of Voice number comes with real confidence, not a single coin flip.
