
How to Measure Your AI Share of Voice (Without Guessing)

Checking ChatGPT once is an anecdote, not a measurement. Here is how to turn AI search visibility into a real number you can track week over week.

The Editorial Team

Most teams “measure” their AI visibility like this. Someone opens ChatGPT, types “best [category] tool,” reads the answer, and feels a little jolt of dread or relief depending on whether the brand showed up. Then they report it in the standup as if it meant something.

It doesn’t. That’s a vibe, not a measurement. And it will lie to you in both directions: it’ll panic you on a bad roll and lull you on a lucky one.

Measuring whether AI answer engines mention, recommend, and cite your brand is a statistics problem. Treat it like one and you get a number you can move. Treat it like a coin flip and you get noise dressed up as insight.

Why a Single Check Lies

AI answers are non-deterministic. Ask the same model the same question twice and you can get two different answers, with different brands named in a different order. That isn’t a glitch. It’s how these systems work.

Now stack the other variables on top. A logged-in account with memory and past chats gets different answers than a fresh one. Someone searching from London sees different results than someone in Austin. Run the same query before and after a model update and the ground has shifted again. And each engine (ChatGPT, Claude, Gemini, Perplexity, Grok, plus Google’s AI Overviews) has its own habits about who it surfaces.

So when one person checks one prompt from one account once, they’ve drawn a single card from a deck and announced the whole hand. The honest answer to “did we show up” is “sometimes, under conditions we didn’t record.” Useless for tracking. Worse than useless if you make decisions on it.

The fix is the same fix every pollster uses. Stop reading one answer. Start measuring a distribution.

Define a Real Prompt Set

The first mistake is testing five prompts you made up. Your buyers don’t ask your five. They ask hundreds of phrasings, and the wording changes which brands get named.

So write down the questions people actually ask on the way to buying what you sell. Not keywords. Real prompts, the way a human talks to an assistant. You’ll usually land somewhere in the range of 50 to 150 of them before the set feels honest, because the same buying question hides behind a dozen wordings.

Then sort them by intent, because each kind tells you something different:

  • Category prompts. “Best tools for X,” “top vendors for Y.” This is pure discovery. Do you exist in the conversation at all?
  • Comparison prompts. “X vs Y,” “alternatives compared.” Now you’re being weighed against a named rival.
  • Alternative prompts. “Alternatives to [competitor].” Are you the answer when someone is already shopping a competitor?
  • Branded prompts. “Is [your brand] any good,” “what does [your brand] do.” When AI describes you directly, does it get you right?

A brand can be invisible in category prompts but described accurately in branded ones. That gap is a strategy brief, not a footnote: the engines know who you are but don’t reach for you during discovery. You can’t see it without the map.
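One way to keep the map honest is to store each prompt with its intent and check that no bucket is empty. This is a minimal sketch, not a prescribed schema: the `Prompt` class, the `coverage_by_intent` helper, and the example prompts (including the placeholder names "ExampleBrand" and "RivalCo") are all assumptions made for illustration.

```python
from collections import Counter
from dataclasses import dataclass

# The four intent buckets described above.
INTENTS = ("category", "comparison", "alternative", "branded")


@dataclass(frozen=True)
class Prompt:
    text: str
    intent: str

    def __post_init__(self):
        # Reject typos early so a prompt never lands in a phantom bucket.
        if self.intent not in INTENTS:
            raise ValueError(f"unknown intent: {self.intent!r}")


def coverage_by_intent(prompts):
    """Count prompts per intent, including intents with zero prompts."""
    counts = Counter(p.intent for p in prompts)
    return {intent: counts[intent] for intent in INTENTS}


# Hypothetical prompts; "ExampleBrand" and "RivalCo" are placeholders.
prompt_set = [
    Prompt("best invoicing tools for freelancers", "category"),
    Prompt("top invoicing vendors for small agencies", "category"),
    Prompt("ExampleBrand vs RivalCo for invoicing", "comparison"),
    Prompt("alternatives to RivalCo", "alternative"),
    Prompt("is ExampleBrand any good", "branded"),
]

print(coverage_by_intent(prompt_set))
```

A real set would be far larger, but the same count makes a thin bucket obvious before any sampling starts.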

Sample Properly

Once you have the prompt set, you run it like an experiment, not a spot check.

Each prompt gets run repeatedly, not once. Across every engine that matters to your buyers. Across fresh and logged-in accounts. Across the geographies you actually sell into. One pass is an anecdote. Many passes is a distribution, and the distribution is the truth.
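To make "many passes" concrete, the sampling plan can be written out as every combination of prompt, engine, account state, and geography, repeated a fixed number of times. The sketch below only builds the plan; it does not call any engine. The function name `build_sampling_plan`, the dictionary keys, and the choice of ten repeats are assumptions for illustration.

```python
from itertools import product


def build_sampling_plan(prompts, engines, account_states, geos, repeats):
    """Return one planned run per combination, repeated `repeats` times."""
    return [
        {"prompt": p, "engine": e, "account": a, "geo": g, "run": r}
        for p, e, a, g, r in product(
            prompts, engines, account_states, geos, range(1, repeats + 1)
        )
    ]


# Illustrative inputs only; swap in your own prompt set and markets.
prompts = ["best invoicing tools for freelancers", "alternatives to RivalCo"]
engines = ["ChatGPT", "Claude", "Gemini", "Perplexity", "Grok", "AI Overviews"]
account_states = ["fresh", "logged_in"]
geos = ["London", "Austin"]

plan = build_sampling_plan(prompts, engines, account_states, geos, repeats=10)
print(len(plan))  # 2 prompts * 6 engines * 2 accounts * 2 geos * 10 repeats = 480
```

The count grows fast, which is the tedium the next paragraph refers to: two prompts already mean 480 runs under these assumptions.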

This is the part people skip because it’s tedious, and it’s exactly the part that makes the number trustworthy. Run a prompt ten times and get named in seven, you have a presence rate of 70 percent for that prompt. Run it once and get named, you have a coin that landed heads. The repetition is the measurement.
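The difference between one run and ten shows up clearly once you put an uncertainty range around the rate. The sketch below uses the Wilson score interval, a standard choice for proportions from small samples; picking it, and the 95 percent level (`z=1.96`), is an assumption of this example rather than something the method requires.

```python
import math


def presence_rate(named_runs, total_runs):
    """Fraction of runs in which the brand was named."""
    if total_runs <= 0:
        raise ValueError("total_runs must be positive")
    return named_runs / total_runs


def wilson_interval(named_runs, total_runs, z=1.96):
    """Wilson score interval for a proportion (z=1.96 is about 95 percent)."""
    if total_runs <= 0:
        raise ValueError("total_runs must be positive")
    p = named_runs / total_runs
    denom = 1 + z**2 / total_runs
    center = (p + z**2 / (2 * total_runs)) / denom
    half = z * math.sqrt(
        p * (1 - p) / total_runs + z**2 / (4 * total_runs**2)
    ) / denom
    # Clamp to [0, 1] to absorb floating-point overshoot.
    return max(0.0, center - half), min(1.0, center + half)


print(presence_rate(7, 10))    # 0.7
print(wilson_interval(7, 10))  # roughly (0.40, 0.89)
print(wilson_interval(1, 1))   # roughly (0.21, 1.0)
```

Named once in one run is consistent with a true rate anywhere from about 21 to 100 percent. Seven in ten narrows that to roughly 40 to 89 percent, and more runs narrow it further.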

Score What Matters

“Did we show up” is too blunt. Being named last in a list of nine isn’t the same as being the first recommendation, and being described as “a budget option” isn’t the same as “the category leader.” Break it into things you can actually score.

| What to score | The question it answers |
| --- | --- |
| Presence | Across all those runs, how often are you named at all? |
| Position | When you appear, how prominently? First pick or buried in a list? |
| Sentiment | How are you described? Recommended warmly, or cited with a caveat? |
| Competitors | Who shows up next to you, and how often do they beat you? |

Presence is your floor. Position and sentiment are the quality of the mention. The competitor column is the one people forget, and it’s the most useful, because AI visibility is relative. You’re not trying to be visible in a vacuum. You’re trying to be the brand it reaches for instead of the other guy.
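As a rough sketch, scoring a single answer on those four dimensions might look like the following. It assumes each answer has already been reduced to an ordered list of brand mentions with a sentiment label attached (by a reviewer or a classifier, which is outside this example). The `Mention` class, the `score_run` function, the label values, and the brand names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Mention:
    brand: str
    sentiment: str  # assumed labels: "positive", "neutral", "negative"


def score_run(brand, mentions):
    """Score one answer; `mentions` is in the order brands were named."""
    names = [m.brand for m in mentions]
    if brand not in names:
        # Absent: every brand that was named beat you in this run.
        return {
            "present": False,
            "position": None,
            "sentiment": None,
            "competitors_ahead": names,
        }
    idx = names.index(brand)
    return {
        "present": True,
        "position": idx + 1,  # 1 means first pick
        "sentiment": mentions[idx].sentiment,
        "competitors_ahead": names[:idx],
    }


answer = [
    Mention("RivalCo", "positive"),
    Mention("ExampleBrand", "neutral"),
    Mention("OtherCo", "positive"),
]
print(score_run("ExampleBrand", answer))
```

Aggregating `competitors_ahead` across runs gives the relative picture: which rivals are named before you, and how often.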

Roll It Into One Number You Can Track

Now collapse all of it into a share-of-voice figure: across your prompt set, across engines and runs, what slice of the available recommendations did you win versus everyone else competing for the same answers.
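There is more than one reasonable way to define that slice. The sketch below uses a deliberately simple one, stated here as an assumption: each brand counts at most once per run, and a brand's share is its mention count divided by all brand mentions across all runs. It ignores position and sentiment, which a fuller version would likely weight in. The function name and brand names are placeholders.

```python
from collections import Counter


def share_of_voice(runs):
    """Map each brand to its share of all brand mentions across runs.

    `runs` is a list of runs; each run is the list of brands named.
    """
    counts = Counter()
    for brands in runs:
        counts.update(set(brands))  # count a brand once per run
    total = sum(counts.values())
    if total == 0:
        return {}
    return {brand: count / total for brand, count in counts.items()}


# Four hypothetical runs of the same prompt set.
runs = [
    ["ExampleBrand", "RivalCo", "OtherCo"],
    ["RivalCo", "ExampleBrand"],
    ["RivalCo"],
    ["OtherCo", "RivalCo"],
]
sov = share_of_voice(runs)
print(sov["ExampleBrand"], sov["RivalCo"], sov["OtherCo"])  # 0.25 0.5 0.25
```

Whatever definition you pick, keep it fixed from week to week, or the trend line stops being comparable.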

The number itself matters less than the trend. A single reading is still a snapshot, and snapshots lie. Track it weekly. Watch the line. Up and to the right means the work is landing. Flat after a campaign means it isn’t, no matter how good the campaign felt. A drop after a model update means the engines reshuffled and you need to respond.
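Watching the line can be partly automated. The sketch below computes week-over-week changes in a share-of-voice series and flags any week that fell by more than a chosen threshold. The three-point threshold and the sample readings are assumptions for illustration, not recommended values.

```python
def week_over_week(series):
    """Change between each pair of consecutive weekly readings."""
    return [round(curr - prev, 4) for prev, curr in zip(series, series[1:])]


def flag_drops(series, threshold=0.03):
    """Indexes (0-based, into `series`) of weeks that fell by >= threshold."""
    return [
        i + 1
        for i, delta in enumerate(week_over_week(series))
        if delta <= -threshold
    ]


# Hypothetical weekly share-of-voice readings.
weekly = [0.18, 0.19, 0.21, 0.21, 0.15]
print(week_over_week(weekly))  # [0.01, 0.02, 0.0, -0.06]
print(flag_drops(weekly))      # [4]
```

A flagged week is a prompt to look at what changed (a model update, a competitor push), not a verdict on its own.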

This is the whole point. “Are we visible in AI” stops being a feeling someone reports nervously and becomes a line on a chart you can defend, attack, and move.

Watch the Leading Indicators

The headline share-of-voice number is a lagging indicator. It moves last.

The early signals move first. You’ll see yourself start appearing in branded and comparison prompts before you crack the competitive category prompts. You’ll get cited as a source before you get recommended as a pick. You’ll climb from “mentioned in passing” to “named in the shortlist” weeks before the top-line number reflects it. Watch those, and you’ll know the work is working before the big number admits it.
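One way to watch those early signals is to break presence out by week and by prompt intent instead of reading only the blended total. This is a minimal sketch under assumed inputs: each record is a hypothetical `(week, intent, present)` tuple produced by your own scoring step.

```python
from collections import defaultdict


def presence_by_week_and_intent(records):
    """Presence rate keyed by (week, intent).

    `records` is an iterable of (week, intent, present) tuples.
    """
    tallies = defaultdict(lambda: [0, 0])  # (week, intent) -> [named, total]
    for week, intent, present in records:
        tallies[(week, intent)][1] += 1
        if present:
            tallies[(week, intent)][0] += 1
    return {key: named / total for key, (named, total) in tallies.items()}


# Hypothetical records: branded presence climbs while category stays flat.
records = [
    (1, "branded", True), (1, "branded", False),
    (1, "category", False), (1, "category", False),
    (2, "branded", True), (2, "branded", True),
    (2, "category", False), (2, "category", False),
]
rates = presence_by_week_and_intent(records)
print(rates[(1, "branded")], rates[(2, "branded")])    # 0.5 1.0
print(rates[(1, "category")], rates[(2, "category")])  # 0.0 0.0
```

In this made-up data the branded rate doubles while category presence has not moved yet, which is the pattern described above: the early buckets shift before the headline number does.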

Where This Leaves You

We built SuperVouch around this rig because we got tired of clients asking “are we showing up” and having no honest answer beyond “we checked, looked fine.” So we built the measurement: real prompt sets, repeated sampling across engines and geographies, scored for presence, position, sentiment, and competitors, rolled into a share-of-voice number we move on purpose. Then we do the work to move it.

Want to see where you actually stand instead of guessing? Get a free AI visibility audit and we’ll show you the real distribution, or book a call and we’ll walk through what’s moving your number and what isn’t.