Are LLM Visibility Trackers Worth It?

Is LLM visibility worth the premium price point these tools have positioned themselves at, or is the juice not worth the squeeze?

TL;DR

  1. When it comes to LLM visibility, not all brands are created equal. For some, it matters far more than others.
  2. LLMs give different answers to the same question. Trackers combat this by simulating prompts repeatedly to get an average visibility/citation score.
  3. While simulating the same prompts isn’t perfect, secondary benefits like sentiment analysis are not SEO-specific. Which, right now, is a good thing.
  4. Unless a visibility tracker offers enough scale at a reasonable price, I would be wary. But if the traffic converts well and you need to know more, get tracking.
(Image Credit: Harry Clarkson-Bennett)

A small caveat to start. This really depends on how your business makes money and whether LLMs are a fundamental part of your audience journey. You need to understand how people use LLMs and what it means for your business.

Brands that sell physical products have a different journey from publishers that sell opinion or SaaS companies that rely more deeply on comparison queries than anyone else.

Or a coding company destroyed by one snidey Reddit moderator with a bone to pick…

For example, Ahrefs made public some of its conversion rate data from LLMs. 12.1% of their signups came from LLMs from just 0.5% of their total traffic. Which is huge.

AI search visitors convert 23x better than traditional organic search visitors for Ahrefs. (Image Credit: Harry Clarkson-Bennett)

But for us, LLM traffic converts significantly worse. It is a fraction of a fraction.

Honestly, I think LLM visibility trackers at this scale are a bit here today and gone tomorrow. If you can afford one, great. If not, don’t sweat it. Take it all with a pinch of salt. AI search is just a part of most journeys, and tracking the same prompts day in, day out has obvious flaws.

They’re just aggregating what someone said about you on Reddit while they’re taking a shit in 2016.

What Do They Do?

Trackers like Profound and Brand Radar are designed to show you how your brand is framed and recommended in AI answers. Over time, you can measure your own and your competitors’ visibility across the platforms.

Image Credit: Harry Clarkson-Bennett

But LLM visibility is smoke and mirrors.

Ask a question, get an answer. Ask the same question, to the same machine, from the same computer, and get a different answer. A different answer with different citations and businesses.

It has to be like this, or else we’d never use the boring ones.

To combat the inherent variance determined by their temperature setting, LLM trackers simulate prompts repeatedly throughout the day. In doing so, you get an average visibility and citation score alongside some other genuinely useful add-ons like your sentiment score and some competitor benchmarking.

“Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.”

OpenAI Documentation
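
To make the temperature point a bit more concrete, here is a toy sketch of how dividing scores by a temperature changes the odds of each answer being picked. It is not how any provider actually implements sampling (real systems layer other settings on top), and the three scores are made-up numbers standing in for three brands competing for one recommendation slot.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, scaled by a temperature value."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [score / temperature for score in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for three brands (illustrative only).
logits = [2.0, 1.5, 0.5]

focused = softmax_with_temperature(logits, 0.2)     # lower temperature
random_ish = softmax_with_temperature(logits, 0.8)  # higher temperature

print([round(p, 3) for p in focused])
print([round(p, 3) for p in random_ish])
```

At the lower temperature, the top-scoring brand takes almost all of the probability. At the higher one, the runners-up get picked far more often, which is why the same prompt can name different businesses on different runs.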

Simulate a prompt 100 times. If your content was used in 70 of the responses and you were cited seven times, you would have a 70% visibility score and a 7% citation score.
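
As a minimal sketch of that arithmetic, assuming each simulated run has already been boiled down to two flags (the `used` and `cited` field names are hypothetical, not any tracker's real schema):

```python
def visibility_scores(runs):
    """Return visibility and citation scores as percentages of all runs."""
    total = len(runs)
    if total == 0:
        return {"visibility": 0.0, "citation": 0.0}
    used = sum(1 for run in runs if run["used"])
    cited = sum(1 for run in runs if run["cited"])
    return {
        "visibility": 100 * used / total,
        "citation": 100 * cited / total,
    }


# 100 simulated runs: content used in 70 of them, cited in 7.
runs = [{"used": i < 70, "cited": i < 7} for i in range(100)]
scores = visibility_scores(runs)
print(scores)
```

Both scores share the same denominator (the total number of runs), which is why a citation score can look tiny next to a healthy visibility score.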

Trust me, that’s much better than it sounds… These engines do not want to send you traffic.

In Brian Balfour’s excellent words, they have identified the moat and the gates are open. They will soon shut. As they shut, monetization will be hard and fast. The likelihood of any referral traffic, unless it’s monetized, is low.

Like every tech company ever.

If you aren’t flush with cash, I’d say most businesses just do not need to invest in them right now. They’re a nice-to-have rather than a necessity for most of us.

How Do They Work?

As far as I can tell, there are two primary models.

  1. Pay for a tool that tracks specific synthetic prompts that you add yourself.
  2. Purchase an enterprise-like tool that tracks more of the market at scale.

Some tools, like Profound, offer both. The cheaper model (the price point is not for most businesses) lets you track synthetic prompts under topics and/or tags. The enterprise model gives you a significantly larger scale.

Whereas tools like Ahrefs Brand Radar provide a broader view of the entire market. As the prompts are all synthetic, there are some fairly large holes. But I prefer broad visibility.

I have not used it yet, but I believe Similarweb have launched their own LLM visibility tracker, which includes real user prompts from Clickstream data.

This makes for a far more useful version of these tools IMO and goes some way to answering the synthetic elephant in the room. And it helps you understand the role LLMs play in the user journey. Which is far more valuable.

The Problem

Does doing good SEO improve your chances of improving your LLM visibility?

Certainly looks like it…

GPT-5 no longer needs to train on more information. It is as well-versed as its overlords now want to pay for. It’s bored of ravaging the internet’s detritus and reaches out to a search index using RAG to verify a response. A response it does not quite have the confidence to give on its own.

But I’m sure we will need to modify it somewhat if your primary goal is to increase LLM visibility. Increased expenditure on TOFU and digital PR campaigns being a notable point.

Image Credit: Harry Clarkson-Bennett

Right now, LLMs have an obvious spam problem. One I don’t expect they’ll be willing to invest in solving anytime soon. The AI bubble and gross valuation of these companies will dictate how they drive revenue. And quickly.

It sure as hell won’t be sorting out their spam problem. When you have a $300 billion contract to pay and revenues of $12 billion, you need some more money. Quickly.

So anyone who pays for best page link inclusions or adds hidden and footer text to their websites will benefit in the short-term. But most of us should still build things for actual, breathing, snoring people.

With the new iterations of LLM trackers calling search instead of formulating an answer for prompts based on learned ‘knowledge’, it becomes even harder to create an ‘LLM optimization strategy.’

As a news site, I know that most prompts we would vaguely show up in would trigger the web index. So I just don’t quite see the value. It’s very SEO-led.

If you don’t believe me, Wil Reynolds is an inarguably better source of information (Image Credit: Harry Clarkson-Bennett)

How You Can Add Value With Sentiment Analysis

I found almost zero value to be had from tracking prompts in LLMs at a purely answer level. So, let’s forget all that for a second and use them for something else. Let’s start with some sentiment analysis.

These trackers give us access to:

  • A wider online sentiment score.
  • Review sources LLMs called upon (at a prompt level).
  • Sentiment scores by topics.
  • Prompts and links to on and off-site information sources.

You can identify where some of these issues start. Which, to be fair, is basically Trustpilot and Reddit.

I won’t go through everything, but a couple of quick examples:

  1. LLMs may be referencing some not-so-recently defunct podcasts and newsletters as “reasons to subscribe.”
  2. Your cancellation process may be cited as the most serious issue for most customers.

Unless you have explicitly stated that these podcasts and newsletters have finished, it’s all fair game. You need to tighten up your product marketing and communications strategy.

For people first. Then for LLMs.

These are not SEO-specific projects. We’re moving into an era where SEO-only projects will be difficult to get pushed through. A fantastic way of getting buy-in is to highlight projects with benefits outside of search.

Highlighting serious business issues – poor reviews, inaccurate, out-of-date information et al. – can help get C-suite attention and support for some key brand reputation projects.

Profound’s sentiment analysis tab (Image Credit: Harry Clarkson-Bennett)
Here it is broken down by topic. You can see individual prompts and responses to each topic (Image Credit: Harry Clarkson-Bennett)

To me, this has nothing to do with LLMs. Or what our audience might ask an ill-informed answer engine. They are just the vessel.

It is about solving problems. Problems that drive real value to your business. In your case, this could be about increasing the LTV of a customer. Increasing their retention rate, reducing churn, and increasing the chance of a conversion by providing an improved experience.

If you’ve worked in SEO for long enough, someone will have floated the idea of improving your online sentiment and reviews past you.

“But will this improve our SEO?”

Said Jeff, a beleaguered business owner.

Who knows, Jeff. It really depends on what is holding you back compared to your competition. And like it or not, search is not very investible right now.

But that doesn’t matter in this instance. This isn’t a search-first project. It’s an audience-first project. It encompasses everyone. From customer service to SEO and editorial. It’s just the right thing to do for the business.

A quick hark back to the Google Leak shows you just how many review and sentiment-focused metrics may affect how you rank.

There are nine alone that mention review or sentiment in the title (Image Credit: Harry Clarkson-Bennett)

For a long time, search has been about brands and trust. Branded search volume, outperforming expected CTR (a Bayesian-type predictive model), direct traffic, and general user engagement and satisfaction.

This isn’t because Google knows better than people. It’s because they have stored how we feel about pages and brands in relation to queries and used that as a feedback loop. Google trusts brands because we do.

Most of us have never had to worry about reviews and sentiment. But this is a great time to fix any issues you may have under the guise of AEO, GEO, SEO, or whatever you want to call it.

Lars Lofgren’s article titled How a Competitor Crippled a $23.5M Bootcamp By Becoming a Reddit Moderator is an incredible look at how Codesmith was nobbled by negative PR. Negative PR started and maintained by one Reddit Mod. One.

So keeping tabs on your reputation and identifying potentially serious issues is never a bad thing.

Could I Just Build My Own?

Yep. For starters, you’d need an estimation of monthly LLM API costs based on the number of monthly tokens required. Let’s use Profound’s lower-end pricing tier as an estimate and our old friend Gemini to figure out some estimated costs.

  • 200 prompts × 10 runs × 12 days (approx.) = 24,000 monthly runs per model (72,000 across 3 models).
  • 24,000 runs × 1,000 tokens/query (conservative est.) = 24,000,000 tokens per model.
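
If you want to rerun those numbers with your own inputs, here is a rough sketch. It treats 24,000 runs as the per-model figure, which is my reading of the numbers rather than anything stated outright, and the price per million tokens is a placeholder, not a real rate. Check each provider's current pricing before trusting any output.

```python
def monthly_usage(prompts, runs_per_prompt, days, tokens_per_run, models=1):
    """Estimate monthly runs and tokens for a prompt-tracking setup."""
    runs_per_model = prompts * runs_per_prompt * days
    return {
        "runs_per_model": runs_per_model,
        "total_runs": runs_per_model * models,
        "tokens_per_model": runs_per_model * tokens_per_run,
    }


def monthly_cost(tokens, price_per_million_tokens):
    """Convert a token count into dollars at a given (assumed) price."""
    return tokens / 1_000_000 * price_per_million_tokens


usage = monthly_usage(
    prompts=200, runs_per_prompt=10, days=12, tokens_per_run=1_000, models=3
)

# Placeholder price: $1.00 per million tokens is NOT a real quote.
example_cost = monthly_cost(usage["tokens_per_model"], price_per_million_tokens=1.00)

print(usage)
print(f"${example_cost:.2f} per model per month at the placeholder price")
```

Swap in real prices per model and the sum gives you the API line of the budget.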

Based on this, here’s a (hopefully) accurate cost estimate per model from our robot pal.

Image Credit: Harry Clarkson-Bennett

Right then. You now need some back-end functionality, data storage, and some front-end visualization. I’ll tot up as we go.

$21 per month

Back-End

  • A Scheduler/Runner like Render VPS to execute 800 API calls per day.
  • A data orchestrator. Essentially, some Python code to parse raw JSON and extract relevant citation and visibility data.

$10 per month
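
For a sense of what that parsing code might look like, here is a minimal sketch. The JSON shape (`prompt`, `answer`, `citations` with a `url` field) is entirely hypothetical; every LLM API returns its own structure, so the field names would need adapting. The brand check is a simple case-insensitive text match, which will miss misspellings and nicknames.

```python
import json
from urllib.parse import urlparse


def extract_metrics(raw_json, brand, domain):
    """Pull mention and citation flags out of one raw (hypothetical) response."""
    data = json.loads(raw_json)
    answer = data.get("answer", "")
    urls = [c.get("url", "") for c in data.get("citations", [])]
    hosts = [urlparse(url).netloc.lower() for url in urls]
    # Count the bare domain and any subdomain (e.g. www.) as a citation.
    cited = any(h == domain or h.endswith("." + domain) for h in hosts)
    return {
        "prompt": data.get("prompt", ""),
        "mentioned": brand.lower() in answer.lower(),
        "cited": cited,
    }


# Made-up response for illustration only.
raw = json.dumps({
    "prompt": "best news subscriptions",
    "answer": "Many readers recommend Example News for its coverage.",
    "citations": [{"url": "https://www.example.com/reviews"}],
})

metrics = extract_metrics(raw, brand="Example News", domain="example.com")
print(metrics)
```

Run this over every stored response and you have the raw material for visibility and citation scores.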

Data Storage

  • A database, like Supabase (which you can integrate directly through Lovable), to store raw responses and structured metrics.
  • Data storage (which should be included as part of your database).

$15 per month
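
To show roughly what "raw responses and structured metrics" means in practice, here is a sketch using Python's built-in `sqlite3` as a local stand-in. It is not Supabase code, and the table and column names are my own assumptions. The same idea (one row per run, flags stored next to the raw response) should carry over to a hosted database.

```python
import sqlite3


def init_db(conn):
    """Create a single table holding one row per simulated prompt run."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS runs (
            id INTEGER PRIMARY KEY,
            run_date TEXT NOT NULL,
            model TEXT NOT NULL,
            prompt TEXT NOT NULL,
            raw_response TEXT NOT NULL,
            mentioned INTEGER NOT NULL,
            cited INTEGER NOT NULL
        )
        """
    )


def save_run(conn, run_date, model, prompt, raw_response, mentioned, cited):
    """Store the raw response alongside the extracted flags."""
    conn.execute(
        "INSERT INTO runs (run_date, model, prompt, raw_response, mentioned, cited) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        (run_date, model, prompt, raw_response, int(mentioned), int(cited)),
    )


def daily_visibility(conn):
    """Return (date, model, visibility %, citation %) rows for charting."""
    return conn.execute(
        "SELECT run_date, model, AVG(mentioned) * 100, AVG(cited) * 100 "
        "FROM runs GROUP BY run_date, model ORDER BY run_date, model"
    ).fetchall()


conn = sqlite3.connect(":memory:")  # in-memory database for the example
init_db(conn)
for mentioned, cited in [(True, True), (True, False), (False, False), (False, False)]:
    save_run(conn, "2025-01-01", "model-a", "best news subscriptions", "{}", mentioned, cited)

rows = daily_visibility(conn)
print(rows)
```

Keeping the raw response means you can re-parse old runs later if you change how you detect mentions.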

Front-End Visualization

  • A web dashboard to create interactive, shareable dashboards. I unironically love Lovable. It’s easy to connect directly to databases. I have also used Streamlit previously. Lovable looks far sleeker but has its own challenges.
  • You may also need a visualization library to help generate time series charts and graphs. Some dashboards have this built in.

$50 per month

$96 all in. I think the likelihood is it’s closer to $50 than $100. No scrimping. At the higher end of budgets for tools I use (Lovable) and some estimates from Gemini, we’re talking about a tool that will cost under $100 a month to run and function very well.

This isn’t a complicated project or setup. It is, IMO, an excellent project to learn the vibe coding ropes. Which I will say is not all sunshine and rainbows.

So, Should I Buy One?

If you can afford it, I would get one. For at least a month or two. Review your online sentiment. See what people really say about you online. Identify some low-lift wins around product marketing and review/reputation management, and review how your competitors fare.

This might be the most important part of LLM visibility. Set up a tracking dashboard via Google Analytics (or whatever dreadful analytics provider you use) and see a) how much traffic you get and b) whether it’s valuable.

The more valuable it is, the more value there will be in tracking your LLM visibility.
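
If you would rather sanity-check this outside your analytics tool, here is a small sketch that works on an exported list of sessions. The export shape (`referrer` and `converted` fields) is hypothetical, and the referrer domain list is my assumption and certainly not exhaustive, so extend it for whichever assistants matter to you.

```python
from urllib.parse import urlparse

# Assumed list of LLM referrer domains; not exhaustive.
LLM_REFERRER_DOMAINS = {
    "chatgpt.com",
    "perplexity.ai",
    "gemini.google.com",
    "copilot.microsoft.com",
    "claude.ai",
}


def is_llm_referrer(referrer):
    """True if the referrer URL belongs to one of the assumed LLM domains."""
    host = urlparse(referrer).netloc.lower()
    return any(host == d or host.endswith("." + d) for d in LLM_REFERRER_DOMAINS)


def llm_traffic_summary(sessions):
    """Answer a) how much LLM traffic you get and b) how well it converts."""
    llm = [s for s in sessions if is_llm_referrer(s["referrer"])]
    total = len(sessions)
    share = 100 * len(llm) / total if total else 0.0
    conversions = sum(1 for s in llm if s["converted"])
    conversion_rate = 100 * conversions / len(llm) if llm else 0.0
    return {
        "llm_sessions": len(llm),
        "traffic_share_pct": share,
        "llm_conversion_pct": conversion_rate,
    }


# Made-up sessions for illustration only.
sessions = [
    {"referrer": "https://chatgpt.com/", "converted": True},
    {"referrer": "https://www.perplexity.ai/search", "converted": False},
    {"referrer": "https://www.google.com/", "converted": False},
    {"referrer": "", "converted": True},
]

summary = llm_traffic_summary(sessions)
print(summary)
```

Compare the LLM conversion rate with your organic search figure and you have a rough answer on whether a tracker is worth the money.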

You could also make one. The joy of making one is a) you can learn a new skill and b) you can make other things for the same cost.

Frustrating, yes. Fun? Absolutely.


This post was originally published on Leadership In SEO.


Featured Image: Viktoriia_M/Shutterstock

Category SEO Generative AI
Harry Clarkson-Bennett SEO Director at Telegraph

SEO Director at The Telegraph with a decade of experience. Unskilled jiu-jitsu guy. Average chess player. Jack of many things.