Key takeaways
- Tracking AI mentions has three layers: the AI-referred traffic in your analytics, manual prompt sampling across engines, and dedicated AI-visibility tools.
- AI answers are highly inconsistent run to run, so a single check is noise. Measure a mention rate across dozens of prompts run repeatedly.
- GA4 added a native “AI Assistant” channel in May 2026, but it recognises only ChatGPT, Gemini and Claude, often misses Perplexity, and cannot see referrer-stripped “dark” traffic.
- A large share of AI-referred visits arrive with no referrer and land in Direct, so custom channel groups plus a dedicated tool recover far more than GA4 alone.
- The metrics that matter: mention rate, share of voice, citation position, sentiment, and the conversion rate of AI-referred sessions.
If AI answer engines are sending traffic and shaping how shoppers see your brand, you need to measure it. This guide covers how to track your Shopify brand’s AI search mentions, end to end.
It pairs with our hub on answer engine optimization for Shopify and the engine-specific guides for ChatGPT and Perplexity.
Why you can trust us
We have spent more than four years in the Shopify ecosystem and built Fudge, an AI page builder used by hundreds of merchants, along with the free AI Readiness Checker that scans what AI engines can read on a store.
The reason to bother: AI-sourced traffic to US retail sites grew 393% year over year in the first quarter of 2026, and that traffic converted 42% better than non-AI traffic in March 2026 - a full reversal from a year earlier, when it converted worse.[1] AI referrals are now a channel worth measuring properly.
Why one check is not enough
AI answers are not a fixed results page. The same prompt returns different brands on different runs. Research by SparkToro and Gumshoe found that across thousands of responses, there was less than a 1-in-100 chance that ChatGPT or Google AI returned the same brand list in any two runs.[2]
That does not make tracking hopeless. It means the right unit of measurement is a mention rate: run a fixed set of prompts many times and measure the percentage of runs in which your brand appears. A single manual check tells you almost nothing; a rate across dozens of prompts is a defensible signal.
Practitioners suggest starting with 20 to 40 prompts spread across the buying funnel, chosen to reflect what your customers actually ask.[3]
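As a rough illustration of the mention-rate idea, the sketch below computes the share of logged runs in which a brand appears, overall and per prompt. The log structure, the prompts and the brand name "Acme Linen" are all hypothetical, and plain substring matching is an assumption that will miss misspellings and abbreviations.

```python
from collections import defaultdict

def mention_rate(runs, brand):
    """Share of runs whose answer text mentions `brand` (case-insensitive substring match)."""
    if not runs:
        return 0.0
    hits = sum(1 for run in runs if brand.lower() in run["answer"].lower())
    return hits / len(runs)

def mention_rate_by_prompt(runs, brand):
    """Mention rate computed separately for each prompt in the log."""
    grouped = defaultdict(list)
    for run in runs:
        grouped[run["prompt"]].append(run)
    return {prompt: mention_rate(rs, brand) for prompt, rs in grouped.items()}

# Hypothetical log: each entry is one run of one prompt on one engine.
runs = [
    {"prompt": "best linen bedding brands", "answer": "Consider Acme Linen or Other Co."},
    {"prompt": "best linen bedding brands", "answer": "Other Co is a popular pick."},
    {"prompt": "linen sheets under $200", "answer": "acme linen offers a budget set."},
    {"prompt": "linen sheets under $200", "answer": "Several stores sell these."},
]

print(mention_rate(runs, "Acme Linen"))
print(mention_rate_by_prompt(runs, "Acme Linen"))
```

In practice the log would hold many runs per prompt, so the per-prompt rates settle into something comparable week to week.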
Layer 1: Track AI-referred traffic in analytics
The GA4 AI Assistant channel
In May 2026, GA4 added a native “AI Assistant” default channel group. It is a start, but it has real gaps: Google named only ChatGPT, Gemini and Claude; Perplexity often lands in Referral instead; and clicks from Google’s own AI Overviews are counted as Organic Search.[4]
Build a custom AI channel
Because of those gaps, build a custom channel group or Explore segment that catches every engine. Match the session source against a pattern like:
chatgpt\.com|openai\.com|perplexity\.ai|claude\.ai|gemini\.google\.com|copilot\.microsoft\.com|you\.com
Place it above Referral in your channel order so these sessions are grouped correctly.[4]
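To sanity-check a pattern like this before relying on it, you could run it over a list of session sources exported from your analytics. The sketch below is an offline check in Python, not GA4 configuration. The channel labels and the `channel_for` helper are hypothetical, and anchoring the pattern to the end of the hostname is an added assumption so that lookalike domains do not match.

```python
import re

# Same domains as the pattern above, anchored so "notyou.com" does not match "you.com".
AI_SOURCE_PATTERN = re.compile(
    r"(^|\.)(chatgpt\.com|openai\.com|perplexity\.ai|claude\.ai|"
    r"gemini\.google\.com|copilot\.microsoft\.com|you\.com)$"
)

def channel_for(source: str) -> str:
    """Classify one session source string into a coarse channel label."""
    host = source.strip().lower()
    if not host or host == "(direct)":
        return "Direct"
    if AI_SOURCE_PATTERN.search(host):
        return "AI (custom)"
    return "Other"

# Hypothetical exported sources.
for source in ["chatgpt.com", "www.perplexity.ai", "notyou.com", "(direct)"]:
    print(source, "->", channel_for(source))
```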
The dark-traffic problem
Even a good custom channel misses a lot. A large share of AI visits arrive with no referrer - from in-app browsers, mobile apps, or copied-and-pasted links - and land in Direct. One analysis of around 450,000 AI-adjacent visits found roughly 70% arrived with no referrer.[5] Assume a meaningful portion of your true AI traffic is hiding in Direct, and treat analytics as a floor, not a full count.
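To see what "a floor, not a full count" can mean in numbers, here is a back-of-envelope sketch. It assumes a fixed share of AI visits arrive without a referrer. The 70% figure comes from a single analysis and may not hold for your store, so the result is a what-if, not a measurement. The function name and the session count are hypothetical.

```python
def estimate_true_ai_sessions(measured: int, dark_share: float) -> float:
    """Scale measured AI sessions up, assuming `dark_share` of AI visits lose their referrer."""
    if not 0 <= dark_share < 1:
        raise ValueError("dark_share must be in [0, 1)")
    # If only (1 - dark_share) of AI visits are visible, divide to recover the total.
    return measured / (1 - dark_share)

measured = 300  # hypothetical AI-referred sessions seen in analytics
print("floor:", measured)
print("what-if at 70% dark:", round(estimate_true_ai_sessions(measured, 0.7)))
```

Reporting both the floor and the what-if figure keeps the assumption visible instead of baking it into a single number.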
On Shopify specifically
Shopify’s own analytics group AI chatbot referrals under Referrer without a dedicated AI channel, so GA4 is the better layer for this. Tie AI-referred sessions to conversions and revenue using your existing conversion tracking setup, and if you route events through Google Tag Manager on Shopify, add the AI source pattern there too.
Layer 2: Sample prompts manually
The cheapest way to start is by hand. Build your prompt set, then run each prompt across ChatGPT, Perplexity, Gemini and Claude on a regular cadence, logging whether your brand appears, in what position, and how it is described.
Keep the caveats in mind: answers vary between runs, personalisation affects results, and there is no fixed ranking. Run each prompt several times, and track the direction over weeks rather than reacting to any single answer.
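A consistent log format makes those week-over-week comparisons possible. The sketch below shows one way to record each manual run as a CSV row. The field names, the `SampleRun` type and the sample values are hypothetical, and an in-memory buffer stands in for a real file.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields
from typing import Optional

@dataclass
class SampleRun:
    run_date: str             # ISO date of the check
    engine: str               # e.g. "ChatGPT", "Perplexity"
    prompt: str
    mentioned: bool           # did the brand appear at all?
    position: Optional[int]   # 1 = first brand named; None if not mentioned
    description: str          # short note on how the brand was described

def append_runs(file_obj, runs, write_header=False):
    """Write runs as CSV rows to an open text file or buffer."""
    writer = csv.DictWriter(file_obj, fieldnames=[f.name for f in fields(SampleRun)])
    if write_header:
        writer.writeheader()
    for run in runs:
        writer.writerow(asdict(run))

buf = io.StringIO()  # stands in for open("ai_mentions.csv", "a", newline="")
append_runs(buf, [
    SampleRun("2026-06-01", "ChatGPT", "best linen bedding brands", True, 2, "praised for value"),
    SampleRun("2026-06-01", "Claude", "best linen bedding brands", False, None, ""),
], write_header=True)
print(buf.getvalue())
```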
Layer 3: Dedicated AI-visibility tools
When manual sampling gets tedious, dedicated tools run large prompt sets across engines automatically and report mention rate, share of voice, citations and sentiment. Pricing in this category changes often and much of it sits behind sales calls, so treat the figures below as starting points to verify.
| Tool | What it tracks | Reported entry price |
|---|---|---|
| Profound | Mentions, share of voice, citations across many engines; enterprise-grade | From around $99/mo |
| Peec AI | Daily visibility, competitor share of voice, sources | From around €89/mo |
| Otterly.ai | Brand mentions, links and sentiment; fast setup | From $29/mo |
| Semrush AI Visibility | Mentions, sentiment, daily prompt tracking | From $99/mo per domain |
| Ahrefs Brand Radar | AI mentions layered on Ahrefs data | Base plan from $129/mo + AI add-on |
| Rankscale | Visibility, citations, sentiment; wide engine coverage | From $20/mo |
For most Shopify stores, the sensible path is to start with manual sampling and a custom GA4 channel, then add a paid tool once AI is a channel worth managing weekly.
What to actually measure
Track a small, consistent set of metrics over time:
- Mention rate - the share of tracked prompt runs in which your brand appears.
- Share of voice - your presence versus named competitors across the set.
- Citation position - whether you are the first source named or the fifth.
- Sentiment - how the engine describes you.
- Referral outcomes - sessions, conversion rate and revenue from AI-referred traffic.
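Two of the metrics above, share of voice and citation position, can be sketched from the same run logs. The definitions used here are assumptions, since tools differ in how they calculate them: share of voice is a brand's mentions divided by all mentions of tracked brands, and position is the average rank at which the brand is named when it appears. Brand names and data are hypothetical.

```python
from statistics import mean

def share_of_voice(runs, brands):
    """Each tracked brand's mentions as a fraction of all tracked-brand mentions."""
    counts = {brand: 0 for brand in brands}
    for cited in runs:  # each run is the ordered list of brands named in one answer
        for brand in cited:
            if brand in counts:
                counts[brand] += 1
    total = sum(counts.values())
    return {brand: (n / total if total else 0.0) for brand, n in counts.items()}

def average_position(runs, brand):
    """Mean 1-based position of `brand` across the runs where it appears, else None."""
    positions = [cited.index(brand) + 1 for cited in runs if brand in cited]
    return mean(positions) if positions else None

# Hypothetical runs: brands in the order each answer named them.
runs = [["Acme", "Rival A"], ["Rival A"], ["Rival B", "Acme"], ["Acme"]]
print(share_of_voice(runs, ["Acme", "Rival A", "Rival B"]))
print(average_position(runs, "Acme"))
```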
You can also get a point-in-time read on how ready your store is to be cited: the AI Readiness Checker scans exactly what an AI engine can and cannot read on your store, which explains a lot of what the tracking above will later show.
FAQ
How do I track my Shopify brand's mentions in AI search?
Use three layers: filter your analytics for AI referral sources, manually sample a fixed set of prompts across ChatGPT, Perplexity, Gemini and Claude on a regular cadence, and optionally add a dedicated AI-visibility tool. Because answers vary run to run, measure a mention rate across many prompts rather than checking once.
Does GA4 track AI traffic?
Partly. GA4's native AI Assistant channel, added in May 2026, recognises ChatGPT, Gemini and Claude but often misses Perplexity and cannot see referrer-stripped traffic. Build a custom channel group matching AI source domains, and expect a share of real AI traffic to still land in Direct.
Why do AI answers change every time I check?
AI answers are generated fresh and are inherently inconsistent. Research found less than a 1-in-100 chance that two runs of the same prompt return the same brand list. That is why a single check is unreliable and you should measure a mention rate across many prompts run repeatedly.
Do I need a paid AI-visibility tool?
Not to start. A custom GA4 channel plus manual prompt sampling costs nothing and is enough for most stores early on. Add a paid tool like Profound, Peec, Otterly or Rankscale once AI is a channel you want to manage weekly and manual sampling becomes too slow.
How much AI traffic is hidden in Direct?
A large share. One analysis of around 450,000 AI-adjacent visits found roughly 70% arrived with no referrer and were logged as Direct in GA4. Assume your analytics undercount AI traffic and treat the measured figure as a floor.
Footnotes
1. Adobe Analytics, via TechCrunch (Apr 16, 2026): AI-sourced traffic to US retail sites grew 393% year over year in Q1 2026 and converted 42% better than non-AI traffic in March 2026, reversing a year-earlier deficit. https://techcrunch.com/2026/04/16/ai-traffic-to-us-retailers-rose-393-in-q1-and-its-boosting-their-revenue-too/
2. SparkToro and Gumshoe research (2025), 2,961 responses across ChatGPT, Claude and Google AI: less than a 1-in-100 chance that two runs returned the same brand list, which is why mention rate across many prompts is the defensible metric. https://sparktoro.com/blog/new-research-ais-are-highly-inconsistent-when-recommending-brands-or-products-marketers-should-take-care-when-tracking-ai-visibility/
3. SE Ranking: on sizing a tracking prompt set to roughly 20 to 40 prompts across the buying funnel. https://seranking.com/blog/how-to-choose-prompts-to-track/
4. Search Engine Journal (May 14, 2026): GA4’s new AI Assistant default channel group recognises ChatGPT, Gemini and Claude, often excludes Perplexity, and counts Google AI Overview clicks as Organic Search. https://www.searchenginejournal.com/google-analytics-adds-ai-assistant-as-default-channel-group/574974/
5. Retailgentic, “Dark Agentic Commerce Traffic”: in an analysis of roughly 450,000 AI-adjacent visits, about 70% arrived with no referrer and were logged as Direct in GA4. https://www.retailgentic.com/p/dark-agentic-commerce-traffic-dact