How to Build an AI Visibility Prompt Set

AI visibility tracking starts with the prompt set.

If the prompts are too broad, the report will mostly tell you which big brands dominate the category. If the prompts are too narrow, the report may flatter you without showing the real market. If the prompts are random, the results will be random too.

A good prompt set is not a list of keywords rewritten as questions. It is a structured way to see how AI systems mention, compare, cite, and recommend brands when people research your category.

This guide explains how to build one.

Start with the decision, not the prompt

The first question is not “what should we ask ChatGPT?”

The first question is: what do we need to learn?

For a SaaS company, the useful questions might be:

Do we appear when people ask for tools in our category?
Which competitors make the shortlist instead of us?
Does AI understand our best use cases?
Does AI describe our product accurately?
Which sources shape the answer?
Do we appear for comparison and alternative questions?

For an ecommerce brand, the questions may be different:

Does AI recommend our products for the right use cases?
Which competing brands appear more often?
Does AI cite review sites, Reddit threads, YouTube videos, or publisher roundups?
Does it describe the product correctly?
Do AI-assisted visits turn into orders?

The prompt set should follow the decision. Otherwise, the report can look organized and still fail to answer anything useful.

Separate branded and unbranded prompts

One of the easiest ways to misread AI visibility is to mix all prompts together.

Branded prompts and unbranded prompts answer different questions. If someone asks “what does [brand] do?” the system already has your name. That checks accuracy and understanding. If someone asks “best tools for [use case],” the system has to decide whether you belong in the answer at all. That checks discovery and recommendation visibility.

Use at least four prompt types:

Prompt type	What it tells you	Example
Discovery	Whether AI finds you without your brand name	Best AI visibility tools for agencies
Comparison	How AI explains tradeoffs	SurfacedBy vs other AI visibility tools
Alternative	Whether you appear around competitors	Best alternatives to [competitor]
Branded accuracy	Whether AI describes you correctly	What does [brand] do?

A brand can look strong on branded accuracy prompts and still be invisible in unbranded discovery prompts. That is why the split matters.
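
The split can be kept explicit in the reporting itself. The sketch below is a minimal illustration, not a prescribed method: it assumes a prompt counts as branded whenever it contains the brand name, and the brand "Acme", the competitor "OtherCo", and the results are made-up placeholders.

```python
def is_branded(prompt: str, brand: str) -> bool:
    """Treat a prompt as branded when it contains the brand name (an assumption)."""
    return brand.lower() in prompt.lower()


def mention_rate_by_group(results, brand):
    """results: list of (prompt_text, brand_was_mentioned) pairs."""
    counts = {"branded": [0, 0], "unbranded": [0, 0]}  # [mentions, prompts]
    for prompt, mentioned in results:
        group = "branded" if is_branded(prompt, brand) else "unbranded"
        counts[group][0] += int(mentioned)
        counts[group][1] += 1
    # None marks a group with no prompts, so it is not mistaken for 0% visibility
    return {g: (hits / n if n else None) for g, (hits, n) in counts.items()}


# Made-up results for a hypothetical brand called "Acme"
results = [
    ("What does Acme do?", True),
    ("Who is Acme best for?", True),
    ("Best AI visibility tools for agencies", False),
    ("Best alternatives to OtherCo", False),
    ("Best AI visibility tools for SaaS companies", True),
]
rates = mention_rate_by_group(results, "Acme")
print(rates)
```

Reporting the two rates side by side makes it harder for a strong branded score to hide a weak unbranded one.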

Use prompt categories, not one giant list

A strong AI visibility prompt set has categories. Each category shows a different part of how AI systems understand the market.

Category prompts show whether your brand appears when someone asks for options in the market.

What are the best project management tools for remote teams?
What are the best AI visibility tools for SaaS companies?
Which email marketing platforms are best for ecommerce?

Problem prompts show whether you appear when the person describes the pain instead of the category.

How can I find out if ChatGPT recommends my competitors?
How do I reduce failed payments in a subscription business?
How can a small marketing team track content performance without a huge SEO stack?

Use-case prompts test whether AI connects your brand to the situations where you are strongest.

Best CRM for a 15-person B2B SaaS team
Best analytics tool for a WooCommerce store
Best AI search visibility platform for agencies

Comparison prompts show how AI explains differences between brands.

HubSpot vs Salesforce for a small SaaS company
Ahrefs vs Semrush for content teams
[brand] vs [competitor]

Alternative prompts reveal competitor gravity.

Best alternatives to [competitor]
Tools like [competitor] for agencies
Cheaper alternatives to [competitor]

Objection prompts test whether AI understands the concerns that come up before someone chooses.

Is [brand] good for small teams?
Is [brand] worth the price?
What are the limitations of [brand]?
Which [category] tools are easiest to set up?

Integration prompts matter when the product is chosen because it works with a specific stack.

Best subscription analytics tool for Stripe and WooCommerce
Best CRM that integrates with Slack and Gmail
Best AI visibility tracker for WordPress sites

Accuracy prompts check what AI says when it already knows the brand name.

What does [brand] do?
Who is [brand] best for?
What are [brand]’s main features?
How much does [brand] cost?

These categories make the results easier to interpret. If you are missing from category prompts but present in branded accuracy prompts, the problem is not brand understanding. It is discovery. If comparison prompts describe you incorrectly, the problem may be positioning, source quality, or outdated third-party evidence.

Add constraints that change the answer

Broad prompts are useful, but they are rarely enough.

People do not always ask “best CRM.” They ask for the best CRM for a small sales team, a nonprofit, a startup, an agency, a specific budget, or a specific integration. The constraint changes the answer.

Useful constraints include:

Company size
Industry
Region
Budget
Technical skill
Existing tools
Use case
Urgency
Compliance or security needs

A generic category prompt may surface the biggest brands. A constrained prompt may surface the brands that actually fit the situation. That is usually the more useful visibility question.
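
One way to generate constrained variants is to fill a template with every combination of constraint values. This is a rough sketch using the standard library. The template wording, the constraint names, and the values are illustrative assumptions, and the output still needs a human pass to drop combinations no real buyer would ask.

```python
from itertools import product


def expand_prompt(template: str, constraints: dict) -> list:
    """Fill the template with every combination of constraint values."""
    keys = list(constraints)
    return [
        template.format(**dict(zip(keys, combo)))
        for combo in product(*(constraints[k] for k in keys))
    ]


# Hypothetical constraint values for illustration only
prompts = expand_prompt(
    "Best CRM for a {company} with a {budget} budget",
    {
        "company": ["small sales team", "nonprofit", "startup"],
        "budget": ["small", "flexible"],
    },
)
for p in prompts:
    print(p)
```

Combinations multiply quickly, so two or three constraints per template is usually enough to keep the set reviewable.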

Start with a small prompt matrix

You do not need 500 prompts on day one.

Start with a prompt matrix that is large enough to show patterns and small enough to review carefully. For most teams, 30 to 50 prompts is a better starting point than a huge list no one can interpret.

A practical starter set could look like this:

Prompt group	Starter count	Purpose
Category	5	See who appears for broad market questions
Problem	5	Test pain-based discovery
Use case	5	Check fit for your strongest scenarios
Comparison	5	See how AI explains tradeoffs
Alternative	5	Check competitor replacement visibility
Accuracy	5	Find wrong or outdated brand descriptions

Then expand based on what you learn. If comparison prompts reveal the most competitor movement, add more comparisons. If use-case prompts show weak visibility, go deeper into specific industries, segments, and constraints.
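
The starter matrix can be written down as a small config so gaps are visible before the first run. The sketch below mirrors the table above. The group keys and the `coverage_gaps` helper are hypothetical names chosen for illustration.

```python
# Target prompt counts per group, taken from the starter table
STARTER_MATRIX = {
    "category": 5,
    "problem": 5,
    "use_case": 5,
    "comparison": 5,
    "alternative": 5,
    "accuracy": 5,
}


def coverage_gaps(prompts_by_group, matrix=STARTER_MATRIX):
    """Return how many prompts each group still needs to reach its target."""
    gaps = {}
    for group, target in matrix.items():
        have = len(prompts_by_group.get(group, []))
        if have < target:
            gaps[group] = target - have
    return gaps


# A made-up draft set with placeholder prompt texts
draft = {
    "category": ["a", "b", "c", "d", "e"],
    "comparison": ["x", "y"],
}
total = sum(STARTER_MATRIX.values())
gaps = coverage_gaps(draft)
print(total, gaps)
```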

The best prompt set is not the biggest one. It is the one people actually use to make decisions.

Track competitors intentionally

A prompt set that only checks your own brand will miss the most useful part of AI visibility tracking.

You need to know who appears instead of you.

For each important prompt category, track:

Which competitors are mentioned
Which competitors are recommended
Which competitor appears first
Which competitors are described most clearly
Which competitors get cited sources
Which competitors appear for use cases you want to own

This is where the report becomes strategic. A competitor that appears once may not matter. A competitor that appears across category, use-case, comparison, and objection prompts is probably winning the answer layer for a reason.
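
A simple way to spot that pattern is to count how many distinct prompt categories each competitor appears in. The sketch below assumes observations are stored as (category, competitor) pairs. "RivalA" and "RivalB" are placeholder names.

```python
from collections import defaultdict


def category_spread(observations):
    """Rank competitors by the number of distinct prompt categories they appear in."""
    seen = defaultdict(set)
    for category, competitor in observations:
        seen[competitor].add(category)
    # Widest spread first, then alphabetical for a stable order
    return sorted(
        ((name, len(cats)) for name, cats in seen.items()),
        key=lambda item: (-item[1], item[0]),
    )


# Made-up observations for illustration
observations = [
    ("category", "RivalA"),
    ("use_case", "RivalA"),
    ("comparison", "RivalA"),
    ("objection", "RivalA"),
    ("category", "RivalB"),
    ("category", "RivalA"),  # repeat in the same category does not add spread
]
spread = category_spread(observations)
print(spread)
```

Spread across categories is a different signal from raw mention count, which is why repeats within one category are ignored here.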

Record the answer, not just the mention

Mentions are easy to count. They are not always meaningful.

A prompt report should capture enough detail to explain what happened and what should change next.

Field to record	Why it matters
Prompt	Shows the exact question being tested
AI system	ChatGPT, Perplexity, Gemini, Claude, and Google AI surfaces can differ
Date	Answers change over time
Brand mentioned	Basic presence
Recommendation strength	A passing mention is not the same as a recommendation
Competitors mentioned	Visibility is relative
First brand mentioned	Shows answer prominence
Cited sources	Reveals the evidence layer
Accuracy issues	Being visible with a wrong description can hurt
Action needed	Keeps the report useful

This keeps the report from becoming a vanity dashboard. An answer that mentions your brand but describes it incorrectly is not a win. An answer that omits you but cites three competitor comparison pages is not just a visibility problem. It is a source gap.
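
The fields in the table map naturally onto one record per answer. This is a minimal sketch, not a required schema: the field names, the three-level recommendation scale, and the `is_win` rule are assumptions made for illustration, and the example values are invented.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class AnswerRecord:
    prompt: str
    ai_system: str
    run_date: date
    brand_mentioned: bool
    # Hypothetical scale: "none", "mentioned", "recommended"
    recommendation_strength: str = "none"
    competitors_mentioned: list = field(default_factory=list)
    first_brand_mentioned: Optional[str] = None
    cited_sources: list = field(default_factory=list)
    accuracy_issues: list = field(default_factory=list)
    action_needed: str = ""

    def is_win(self) -> bool:
        """A mention only counts when the answer has no recorded accuracy issues."""
        return self.brand_mentioned and not self.accuracy_issues


# Invented example: the brand is mentioned, but described incorrectly
record = AnswerRecord(
    prompt="What does [brand] do?",
    ai_system="ChatGPT",
    run_date=date(2025, 1, 15),
    brand_mentioned=True,
    recommendation_strength="mentioned",
    accuracy_issues=["describes an outdated feature set"],
    action_needed="Update third-party listings",
)
print(record.is_win())
```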

Do not over-trust one answer

AI answers are not fixed rankings. The same prompt can vary by model, retrieval behavior, location, date, wording, browsing state, and sampling.

A recent arXiv paper on LLM search visibility argues that visibility should be treated as an estimate from a response distribution, not a fixed number from a single run. That framing is useful because it keeps the report honest: one answer is a clue, not a measurement.

For important prompts, track changes over time. Where possible, repeat prompts and look for patterns instead of treating a single screenshot as proof.

The question is not “did we appear once?” The question is “do we appear reliably enough that the pattern is worth acting on?”
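
One way to make "reliably enough" concrete is to repeat a prompt and put an interval around the mention rate. The sketch below uses the Wilson score interval, a standard choice for small samples. It is an illustration of the idea, not the method from the paper mentioned above, and it assumes the runs are independent, which repeated AI answers may only approximate.

```python
import math


def wilson_interval(hits: int, runs: int, z: float = 1.96):
    """Wilson score interval for a mention rate (z=1.96 is roughly 95%)."""
    if runs <= 0:
        raise ValueError("need at least one run")
    p = hits / runs
    denom = 1 + z**2 / runs
    centre = (p + z**2 / (2 * runs)) / denom
    half = z * math.sqrt(p * (1 - p) / runs + z**2 / (4 * runs**2)) / denom
    return centre - half, centre + half


# Brand mentioned in 7 of 10 repeated runs of the same prompt (made-up numbers)
low, high = wilson_interval(7, 10)
print(f"7/10 runs: {low:.2f} to {high:.2f}")

# A single run with a mention leaves a very wide range
low_one, high_one = wilson_interval(1, 1)
print(f"1/1 runs: {low_one:.2f} to {high_one:.2f}")
```

The width of the single-run interval is the point: one appearance is compatible with a brand that shows up almost always or only occasionally.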

Use fan-out as a coverage check

AI search systems may expand a question into related questions, subtopics, constraints, and follow-ups. Google’s AI search documentation describes query fan-out as part of AI Search, and Search Engine Land has covered fan-out as a way to find missing subtopics and structural gaps. Both point to the same practical idea: one prompt often represents a wider question set.

Do not use fan-out as an excuse to create hundreds of thin pages. Use it to improve your prompt set and your content coverage.

For each main prompt, ask:

What follow-up questions would someone ask?
What comparisons are implied?
What constraints change the answer?
What objections need to be resolved?
What source types would AI likely need to answer well?

That gives you better prompts and better content priorities.

Avoid prompt-set mistakes

Bad prompt sets usually fail in predictable ways.

Only tracking branded prompts: this checks accuracy, not discovery.
Only tracking broad category prompts: this often favors the largest brands and misses use-case fit.
Changing prompts every week: this makes trends impossible to read.
Never changing prompts: this ignores market changes, new objections, and new competitors.
Rewriting prompts until the answer looks good: this creates a flattering report, not a useful one.
Counting mentions without reading the answer: this misses wrong descriptions, weak recommendations, and source gaps.

The goal is not to design prompts that make the brand look good. The goal is to design prompts that reveal what AI systems are likely to say when people research the category.

Refresh the prompt set as the market changes

A prompt set is not permanent.

Refresh it when:

You launch a new feature
You enter a new category
A competitor changes positioning
Customers start asking different questions
Google AI Mode, ChatGPT, Perplexity, Gemini, or Claude changes behavior
New citations or source gaps appear
Your sales team hears a new objection repeatedly

Keep enough consistency to measure change. Add new prompts when the market changes. Both matter.
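
In practice this can mean comparing trends only on prompts that exist in both periods and listing added and retired prompts separately. The sketch below assumes each period is a mapping from prompt text to mention rate. The function name and the numbers are hypothetical.

```python
def compare_periods(previous: dict, current: dict):
    """Compute rate changes on shared prompts; report added and retired prompts."""
    shared = sorted(previous.keys() & current.keys())
    added = sorted(current.keys() - previous.keys())
    retired = sorted(previous.keys() - current.keys())
    deltas = {prompt: current[prompt] - previous[prompt] for prompt in shared}
    return deltas, added, retired


# Made-up mention rates for two reporting periods
previous = {
    "Best AI visibility tools for agencies": 0.5,
    "Best alternatives to [competitor]": 0.25,
}
current = {
    "Best AI visibility tools for agencies": 0.75,
    "Best AI visibility tracker for WordPress sites": 0.5,
}
deltas, added, retired = compare_periods(previous, current)
print(deltas, added, retired)
```

Keeping new prompts out of the trend line until they have a prior period avoids reading a refresh as a change in visibility.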

Where SurfacedBy fits

SurfacedBy helps teams track how AI systems mention, cite, compare, and recommend their brand across important prompts.

The prompt set is the foundation. Once the right questions are being tracked, the useful work is seeing which competitors appear, which sources shape the answer, whether the answer is accurate, and what should be improved next.

That is the difference between prompt tracking as a screenshot exercise and prompt tracking as an AI visibility system.

The bottom line

A good AI visibility prompt set should reflect how people actually research, compare, question, and choose.

Do not track prompts just because they sound important. Track prompts because the answers would change a decision.

Separate branded and unbranded prompts.
Use categories, not a random list.
Add real constraints.
Track competitors intentionally.
Record recommendation strength, sources, and accuracy.
Repeat important prompts over time.
Use fan-out to find coverage gaps.
Keep the set small enough to act on.

The best prompt set is not the biggest one. It is the one that helps you see where AI systems already understand you, where they prefer competitors, and what needs to improve.