AI Search Blueprint: Entity Maps, Structured Data, IndexNow & The Basics

We know how overwhelming it can be to keep pace with rapidly changing SEO strategies, especially with AI switching up the rules faster than ever.

This collection of insights was handpicked to give you clarity and confidence on what really matters right now: how to make your content, structure, and strategy AI-ready, from basics to technical necessities.

If you’re redefining your SEO strategy for the age of AI, we hope these pieces become your go-to guide for navigating what’s next in search.

Keep testing and adjusting your strategy as the market changes, just like we do. And as always, let us know what you’d like to see more of.

Katie Morton, Editor in Chief, Search Engine Journal

AI Search Optimization: Front-End and Back-End Website Strategies for Visibility

Discover practical front-end and back-end strategies for AI search optimization that help your brand gain visibility in Large Language Models and AI Overviews.

Outerbox

Content Isn’t Enough to Succeed in AI Search

E-E-A-T and technical SEO still matter, but LLMs add a new dimension: visibility now depends on whether your brand appears in live web results and in the model’s training memory.

While AIOs and LLMs rely on many of the same signals as SEO, such as clarity, structure, and authority, there’s another major factor: training data. Content planning now also depends on how your buyers phrase their prompts.

Let’s break this down further:

  • Live Web Indexes (Google and Bing’s crawlers): Here, SEO strategies, technical optimization, and structured content directly influence your visibility. Timely, specific, or technical queries are pulled from here, as are real-time needs (pricing, hours, availability), or niche long-tail searches.
  • Model Training Data: This refers to the knowledge embedded in the LLM itself, which relies on what the model has learned and retained. While evergreen questions, general best practices, or long-established facts are a part of this training, visibility for brands to populate answers depends on whether your brand had enough authority (citations, PR, backlinks, thought leadership) before the model’s training cutoff.

Every brand now faces a split discovery model: are your audiences asking questions anchored in static knowledge or in a live context? Knowing which side of that divide you serve determines how you invest.

Visibility in AI search isn’t just about great content—it’s about where that content lives and how it’s interpreted. To compete, websites must now operate on two fronts: the visible, front-end authority that teaches models who you are, and the technical, back-end foundation that ensures your content can be found and read. Let’s unpack both.

The Dual Path to AI Search Visibility

So, how do you optimize or redesign/build for the two distinct systems dictating visibility in LLMs and AIOs (one of which still powers SEO)?

Your website works on two fronts:

  • Front-end Strategies: This includes content, messaging, and E-E-A-T signals. These build your brand’s authority over time. Content hubs, thought leadership, PR mentions, and case studies aren’t just for SEO rankings – they’re the very signals training models take in when deciding which brands to remember in their datasets. If your brand hasn’t established that footprint yet, this effort won’t yield results instantly. The play is to invest in content that others cite, share, and link to. Those external authority signals are what models carry forward in their training cycles.
  • Back-end Strategies: This includes technical SEO, structured data, and product feeds: the infrastructure layer supporting SEO and LLMs/AIOs. It won’t rewrite training data, but it ensures your site is accessible, machine-readable, and index-friendly. These tactics increase the likelihood that your content will surface when an AI system pulls from the live web index.

In short, the front-end builds authority that training data can carry forward, while the back-end ensures discoverability whenever prompts tap the live index.

Let’s examine tactics for each. If the back-end is your website’s infrastructure, the front-end is your reputation—so let’s begin with how to build the authority AI can’t ignore.

Front-End Optimizations: Building Authority

As we noted earlier, authority signals help determine whether your content becomes part of an AI model’s ‘memory.’ These front-end optimizations are how you build that foundation.

How do you know if your website’s front-end has enough authority to be recognized by training models? Look at the same signals you’d use for SEO credibility:

  • Are you cited or referenced by other authoritative domains?
  • Do your executives, products, or brand appear in knowledge panels or on sites like Quora?
  • Are you publishing original research, case studies, or data that earns backlinks and media coverage?

If you haven’t built that foundation yet, start now: authority compounds, and it can’t be retrofitted. Here are four tactics to help.

1. Use Content Hubs to Signal Authority

AI doesn’t just reward keyword-rich content—it rewards structured authority. Interlinked pillar pages supported by blogs, FAQs, and resource articles help LLMs understand your expertise:

  • Organize content around buyer questions, not just keywords.
  • Use Q&A-style formatting that mirrors how users phrase prompts.
  • Refresh evergreen pages every 3–6 months to stay top of mind for AI systems.

Content hubs provide SEO the signals it needs today and give LLMs the authority cues they’ll need tomorrow.

2. Amplify with User-Generated Content (UGC)

LLMs value authenticity. Reviews, Q&A sections, and customer commentary provide precisely the type of “fresh signals” AI systems trust:

  • Encourage reviews and testimonials directly on product and service pages.
  • Add structured Q&A blocks for customer questions.
  • Pull insights from forums like Reddit and user communities into your own content ecosystem.

For marketers, this is an opportunity to let your customers’ voices amplify your brand’s authority.

3. Optimize for Long-Tail, Conversational Queries

Data shows LLMs increasingly cite sources well beyond Google’s top ten, so optimizing for long-tail queries is no longer a “nice to have”:

  • Target technical, niche, or question-based content.
  • Include rich examples, comparisons, and definitions to align with AI’s preference for depth.

So instead of “best plumber near me”, get more specific. Try “cost of trenchless pipe repair for older homes”.

4. Merchant Feeds

For eCommerce websites, visibility depends on your merchant feeds, which should be complete and AI-friendly. Ensure feeds include:

  • Optimized product titles and descriptions
  • GTIN, size, color, material, and sustainability tags
  • Ratings and reviews connected directly to product listings

If feeds are incomplete, your products may never appear in AI-driven shopping results—regardless of how well your site ranks organically.

Merchant data is one of the few structured inputs LLMs treat as verified context, giving complete feeds outsized influence in AI-driven commerce.

Now, let’s look at back-end improvements, the one lever you can pull immediately to see results sooner.

Back-End Optimizations: Building Infrastructure for All Search

Just as front-end authority helps models remember your brand, back-end precision ensures they can find and understand it. Structured markup, entity linking, clean site architecture, and technical SEO play a decisive role in how brands surface when customer prompts rely on the live web index.

Together, these back-end optimizations act like a translator between your content and the systems parsing it. They tell traditional search and AI search models, “Here’s what this means, here’s how it connects, and here’s why it should surface,” which impacts whether your website gets cited.

1. Structured Data = Context for AI

LLMs and search engines lean on structured data to connect the dots. Marketers should ensure their teams implement schema markup for articles, FAQs, products, reviews, authors, and organizations.

Additionally, assign @id properties to identify and connect entities such as brands, products, and executives. This “name-tagging” provides AI systems with the context they need to represent your brand accurately.

By consistently using the same @id across your site, you’re building a semantic graph—a network of connected relationships. For example:

  • Your CEO’s author bio (@id: /executives/name) links to your company schema (@id: /organization/company).
  • A product (@id: /products/widget-123) ties to your brand entity.

The cleaner and more consistent your entity linking, the more confidently AI can represent your brand in its answers. When entity relationships are explicit, models don’t have to infer—and that alone moves your brand up the confidence hierarchy.
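To make this concrete, here is a minimal JSON-LD sketch (placed in a <script type="application/ld+json"> tag) of how consistent @id values connect an executive, the organization, and a product. Every name and URL below is a hypothetical placeholder:

  {
    "@context": "https://schema.org",
    "@graph": [
      {
        "@type": "Organization",
        "@id": "https://example.com/organization/company",
        "name": "Example Co.",
        "url": "https://example.com/"
      },
      {
        "@type": "Person",
        "@id": "https://example.com/executives/jane-doe",
        "name": "Jane Doe",
        "jobTitle": "CEO",
        "worksFor": { "@id": "https://example.com/organization/company" }
      },
      {
        "@type": "Product",
        "@id": "https://example.com/products/widget-123",
        "name": "Widget 123",
        "brand": { "@id": "https://example.com/organization/company" }
      }
    ]
  }

Because the Person and Product nodes reference the Organization by @id rather than repeating its details, every page that reuses these identifiers reinforces the same entity graph.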

2. Keep Content Crawlable with Less JavaScript

AI models aren’t browsers; they parse at scale and primarily read raw HTML. JavaScript rendering builds content in the browser only after scripts run, so if critical content on your site loads exclusively through JavaScript, there’s a high chance LLMs and AIOs won’t see it. This is where server-side rendering (SSR) becomes essential: the server delivers the full HTML of a page, so key text, schema, and structured data are readable by search engines and AI models immediately, without requiring scripts to run.

You don’t need to eliminate JavaScript—it’s valuable for interactivity and user experience—but it should never be the only way important content, navigation, or metadata is displayed. A good test: turn JavaScript off in your browser. If core content or schema disappears, so will your visibility in AI search results.

3. Emerging AI Protocols

As AI search evolves, a handful of new technical protocols are becoming baseline requirements for brands that want consistent visibility in AI results:

  • IndexNow: This protocol instantly notifies search engines when content is created, updated, or deleted on your site. Instead of waiting for crawlers to revisit, IndexNow pushes updates to AIOs, LLMs, and traditional search engines. For marketers, that means faster visibility for new campaigns, product launches, or time-sensitive offers.
  • llms.txt: Similar to robots.txt, this file lets you signal to AI models which resources are designed for LLM consumption. It says, “Here’s where the clean, structured answers live.” It doesn’t guarantee inclusion, but it gives models a better chance of parsing your most authoritative content.

These protocols directly influence how quickly and clearly your brand is understood by AI systems. Falling behind here doesn’t just mean slower SEO results—it could mean your brand is invisible in AI-driven search altogether.
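For reference, the community llms.txt proposal (llmstxt.org) describes a plain Markdown file served at /llms.txt: an H1 with the site name, a short blockquote summary, and sections of annotated links pointing to your cleanest resources. A hypothetical sketch:

  # Example Co.

  > Example Co. makes a CRM platform for mid-market sales teams.

  ## Key resources
  - [CRM platform overview](https://example.com/crm): What the product does and who it serves
  - [Pipeline management](https://example.com/crm/pipeline-management): Feature guide with pricing and FAQs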

4. Maintain Technical SEO Hygiene

Technical SEO is not a new tactic, but it’s worth the reminder. Without a solid technical foundation, even the best content won’t surface as a citation.

  • Performance and Speed: Subpar load times hurt crawl budgets, weaken engagement signals, and reduce eligibility for AI parsing. Models are biased toward sources they can parse quickly and consistently, making Core Web Vitals and lightweight code critical.
  • Mobile Optimization: With long-established mobile-first indexing, responsive design is non-negotiable. AIOs also lean on mobile-friendly content to ensure broad accessibility, so poor responsiveness directly reduces visibility.
  • Site Architecture and Internal Linking: Flat, logical structures with consistent hierarchies facilitate the mapping of entity relationships. Internal linking reinforces topical clusters and keeps high-value content discoverable, rather than buried in orphan pages.
  • Schema Integrity and Code Quality: Broken markup or messy code introduces ambiguity. LLMs and search engines rely heavily on schema as trust signals—errors or inconsistencies reduce the likelihood of citation.
  • Security (HTTPS): Still a ranking signal and a baseline trust factor. For AI systems surfacing recommendations, unsecured sites introduce risk and are less likely to be prioritized.

Regularly check your technical SEO hygiene to ensure simple fixes like these are addressed promptly.

Investing in Your Website Is Investing in AI Visibility

AI search is a structural shift in how buyers discover, evaluate, and trust brands. And while the tactics may evolve, the throughline is clear: your website is still the nucleus of it all.

Front-end authority building ensures your brand is recognized and remembered by training models. Back-end technical precision makes sure your content is crawlable, connected, and ready when prompts hit the live web index. Skip one side of the equation, and you’re leaving opportunities on the table.

The reality is that brand visibility depends on your website investment. It’s time to look at user intent beyond content alone. Focus on how your front-end and back-end tactics can help your brand populate answers to potential customer queries, both in future model training and in the live web index. This is the strategy marketers will need to succeed.


How To Get Your Content (& Brand) Recommended By AI & LLMs

Want your content and brand cited in AI results? Focus on substance, not shortcuts. Here's the strategy that works.

Andreas Voniatis

The game has changed, and quite recently, too.

Generative engine optimization (GEO), AI Overviews (AIOs), or just an extension of SEO (now being dubbed on LinkedIn as Search Everywhere Optimization) – which acronym is correct?

I’d argue it’s GEO, as you’ll see why. And if you’ve ever built your own large language model from scratch like I did in 2020, you’ll know why.

We’ve all seen various frightening (for some) data on how click-through rates have dropped off a cliff with Google AIOs and how LLMs like ChatGPT are eroding Google’s share of search – basically, “SEO is dead” – so I won’t repeat them here.

What I will cover are first principles to get your content (along with your company) recommended by AI and LLMs alike.

Everything I disclose here is based on real-world experiences of AI search successes achieved with clients.

Using an example I can talk about, I’ll go with Boundless as seen below.

Screenshot by author, July 2025

Tell The World Something New

Imagine the dread a PR agency might feel if it signed up a new business client only to find they haven’t got anything newsworthy to promote to the media – a tough sell. Traditional SEO content is a bit like that.

We’ve all seen and done the rather tired “ultimate content guide to [insert your target topic]” playbooks, which attempt to turn your website into the Wikipedia (a key data source for ChatGPT, it seems) of whatever industry you happen to be in.

And let’s face it, it worked so well, it ruined the internet, according to The Verge.

The fundamental problem with that type of SEO content is that it has no information gain. When trillions of webpages all follow the same “best practice” playbook, they’re not telling the world anything genuinely new.

You only have to look at the Information Gain patent by Google to underscore the importance of content possessing value, i.e., your content must tell the world (via the internet) something new.

BoundlessHQ commissioned a survey on remote work, asking ‘Ideally, where would you like to work from if it were your choice?’

The results provided a fresh dataset; this kind of content is high-effort, unique, and value-adding enough to get cited in AI search results.

Of course, it shouldn’t have taken AI to push us to produce this kind of content, as it would be good SEO content marketing in any case. AI has simply forced our hand (more on that later).

After all, if your content isn’t unique, why would journalists mention you? Bloggers link back to you? People share or bookmark your page? AI retrain its models using your content or cite your brand?

You get the idea.

For improved AI visibility, include your data sources and research methods with their limitations, as this level of transparency makes your content more verifiable to AI.

Also, updating your data more regularly than annually will indicate reliability to AI as a trusted information source for citation. What LLM doesn’t want more recent data?

SEO May Not Be Dead, But Keywords Definitely Are

Keywords don’t tell you who’s actually searching. They just tell you what terms trigger ads in Google.

Your content could be appealing to students, retirees, or anyone. That’s not targeting; that’s one size fits all. And in the AI age, one size definitely doesn’t fit all.

So, kiss goodbye to content guides written in one form of English, which win traffic across all English-speaking regions.

AI has created more jobs for marketers: to win the same traffic as before, you’ll now need to create region-specific versions of that content for each English-speaking market.

Keyword tools also allegedly tell you the search volumes your keywords are getting (if you still want them; we don’t).

So, if you’re planning your content strategy on keyword research, stop. You’re optimizing for the wrong search engine.

What you can do instead is robust market research based on the raw data sources used by LLMs (not the LLM outputs themselves). For example, Grok uses X (Twitter), ChatGPT has publishing partnerships, and so on.

Those discussions are the real topics to build your content strategy around, and their volume is the real content demand.

AI Inputs, Not AI Outputs

I’m seeing some discussions (recommendations even) that creating data-driven or research-based content works for getting AI recommendations.

Given the dearth of true data-driven content that AI craves, enjoy it while it lasts, as that will only work in the short term.

AI has raised the content bar: people are now more specific in their search patterns, such is their confidence in the technology.

Content marketers will therefore need to rise to the challenge and produce more targeted, substantial content.

But, even if you are using LLMs in “deep” mode on a premium subscription to inject more substance and value into your content, that simply won’t make the AI’s quality cut.

Expecting such fanciful results is like asking AI to rehydrate itself with its own sweat.

The results of AI are derivative, diluted, and hallucinatory by nature. The hallucinatory nature is one of the reasons why I don’t fear LLMs leading to artificial general intelligence (AGI), but that’s another conversation.

Because of this value degradation, AI will not want to risk retraining its models on content founded on AI outputs, for fear of becoming dumber.

To create content that AI prefers, you need to use the same data sources that feed AI engines. It’s long been known that Google started its LLM project over a decade ago, when it began training its models on Google Books and other literature.

While most of us won’t have the budget for an X.com data firehose, you can still find creative ways (like we have), such as running surveys with robust sample sizes.

Meaningful press coverage, media mentions, and good backlinks will be significant enough to shift AI into seeing the value of your content, judging it good enough to retrain its models and update its worldview.

And by data-mining the same data sources, you can start structuring content as direct answers to questions.

You’ll also find your content is written to be more conversational to match the search patterns used by your target buyers when they prompt for solutions.

SEO Basics Still Matter

GEO and SEO are not the same. Reverse engineering search engine results pages to direct content strategy and formulation was effective because rank position is a regression problem.

In AI, there is no rank; there are only winners and losers.

However, there are some heavy overlaps that won’t go away and are even more critical than ever.

Unlike SEO, where more word count was generally better, AI faces the additional constraints of rising energy costs and shortages of computer chips.

That means content needs to be even more efficient for AI than for search engines, so models can break it down and parse meaning before determining its value.

So, by all means:

  • Code pages for faster loading and quicker processing.
  • Deploy schema for adding context to the content.
  • Build a conversational answer-first content architecture.
  • Use HTML anchor jump links to different sections of your content.
  • Open your content to LLM crawling and use an llms.txt file.
  • Provide programmatic content access, via RSS feeds or other machine-readable formats.

These practices are more points of hygiene to help make your content more discoverable. They may not be a game changer for getting your organization cited by AI, but if you can crush GEO, you’ll crush SEO.
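For the jump-link item in the list above, the pattern is plain HTML: in-page anchors pointing at id attributes on section headings, so engines and readers can land directly on a specific answer. Section names here are illustrative:

  <nav>
    <a href="#key-findings">Key findings</a>
    <a href="#methodology">Methodology</a>
  </nav>

  <h2 id="key-findings">Key findings</h2>
  <p>An answer-first summary of the data goes here.</p>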

Human, Not AI-Written

AI engines don’t cite boring rehashes. They’re too busy doing that job for us, citing sources for their rehash instead.

Now, I have heard arguments that if the quality of the content is on point (let’s assume it even includes information gain), then AI shouldn’t care whether it was written by AI or a human.

I’d argue otherwise, because the last thing any LLM creator wants is for their LLM to be retrained on content generated by AI.

While it’s unlikely that generative outputs are tagged in any way, it’s pretty obvious to humans when content is AI-written, and it’s also pretty obvious statistically to AI engines, too.

LLMs will have certain tropes that are common to AI-generated writing, like “The future of … “.

LLMs won’t default to recounting lived personal experiences or spontaneously producing subtle humour without heavy creative prompting.

So, don’t do it. Keep your content written by humans.

The Future Is New, Targeted, Substantial Value

Getting your content and your company recommended by AI means it needs to tell the world something new.

Make sure it offers information gain based on substantive, non-LLM-derived research (enough to make it worthy of LLM model inclusion), while nailing the SEO basics and keeping everything human-written.

The question now becomes, “What can you do to produce high-effort content good enough for AI without costing the earth?”

Featured Image: Collagery/Shutterstock

Internal Linking Grows Up: Evolving From Link Juice To Entity Maps

From PageRank flow to entity mapping, internal linking has evolved into one of SEO’s most powerful levers for organic visibility.

Kevin Indig

Let’s reminisce for a moment. Do you remember how, back in 2020, we all obsessed over “link juice” and PageRank flow as far as internal links are concerned?

In 2025, what matters more is how your internal links define the entities and relationships on your site.

Internal linking is no longer just about distributing authority. It’s about:

  • Building your own semantic map that Google can trust.
  • Reinforcing your topical authority.
  • Earning a place in an AI-search-forward landscape.

The last full guide I wrote on internal linking strategies was in 2020, and – well – much has happened since then (to say the least).

And most internal linking guides treat links as simple “traffic routers,” ignoring their role in building entity context.

So today, yes, I’m revisiting some of the basic building blocks of SEO, but we’re going to expand how we think about internal linking.

If you’re already deep into entity-first SEO and apply it to your internal linking tactics, skip ahead to the action items to ensure you’re implementing it well.

For everyone else, I’ll explain why tightening up your internal linking structure isn’t just table stakes. It’s one of the simplest core levers to influence organic visibility.

Image Credit: Kevin Indig


Why Is Strategic Internal Linking Still Important For SEO/Organic Visibility?

Internal linking is the age-old SEO practice of connecting one page on your site to another page, all on the same domain.

These links act like the roads or highways that guide users through your content. But they also help search engines understand how your pages relate.

In the past, we thought about internal links as “pipes” for PageRank.

Add enough links from your homepage or other strong, well-ranking pages, and you’d push authority toward the URLs you wanted to rank.

That view isn’t wrong; it’s just incomplete.

Today, internal links aren’t just distributing authority. They’re defining the semantic structure of your site.

Internal linking isn’t simply a practice that routes people (and bots/crawlers) to the pages you want them to go to.

In fact, when we think about internal linking only this way, we start to half-ass the practice or let it sit on the back burner.

The words you use in anchor text and the way you connect hubs of related content all signal to search engines: These are the entities your brand wants to be known for.

Strategic internal linking can do three critical things for your site:

  1. Reinforce entity authority. You’re signaling to Google, and everyone else, which concepts you want associated with your brand.
  2. Improve index stability. Pages that are well-linked internally are more likely to be crawled often – and that means they stay indexed and are likely to show up in AI-generated results. (This is especially true for Bing optimization, as Bing seems to struggle more with indexing than Google. Bing is often forgotten when it comes to AEO/GEO because everyone assumes ChatGPT only uses Google, but it doesn’t.)
  3. Drive user engagement. Smart placement and descriptive anchors help users explore more of your related content, increasing engagement signals.

Put simply: Internal links aren’t just SEO plumbing. They’re how you build a discoverable, authoritative entity graph inside your own site.

Do LLMs Register Internal Links?

Generative AI being infused into all modalities of search means Google and LLMs aren’t just hiking all over the web searching for crawlable/indexable pages — search engines and LLMs are mapping relationships between entities and judging your brand’s authority accordingly.

But currently, there’s some disagreement on whether or not LLMs can navigate your site through internal links.

My hypothesis? LLMs do form entity relationships via your strategic use of internal links – but probably not by traditionally “crawling” them like search engines do, and more purely based on text signals on the page.

And if that turns out to be true – keeping in mind that LLMs often use search engine results to ground themselves – internal linking also benefits LLM optimization/AEO/GEO mostly by improving Google/Bing ranks, which LLMs heavily rely on.

I dropped the question over on LinkedIn; you can check out the discussion there. But a few responses stood out. (Take a look at the full thread, but I also highly recommend following these pros to learn more from each of them.)

Dan Petrovic, founder and CEO of Dejan SEO, gave a detailed answer about the differences between a) the types of LLM crawlers and b) the different LLMs and how they behave.

Image Credit: Kevin Indig

Lily Grozeva, head of SEO at Verto Digital, rightfully called out that we can all get the answer in our own logfiles.

Image Credit: Kevin Indig

Chee Lo, head of SEO at Trustpilot, shared his experience with Perplexity, which seems to be a bit more aggressive than other bots.

Image Credit: Kevin Indig

Why Thinking In Entities Can Change The Game For Internal Linking

Sites with clear internal linking patterns that mirror how humans connect concepts are (in theory, more data will tell over time) better positioned to be included in AI-generated answers and entity-rich snippets.

Way back in 2019, I explained the following in Semantic content optimization with entities:

Entities are semantic, interconnected objects that help machines to understand explicit and implicit language. In simpler terms, they are words (nouns) that represent any type of object, concept, or subject … According to Cindy Krum and her fantastic entity series, Google seems to restructure its whole approach to indexing based on entities (while you’re at it, read AJ Kohn’s article about embeddings). Understanding entities and how Google uses them in search sharpens our standards for content creation, optimization, and the use of schema markup.

Entities are nouns like events, ideas, people, places, etc. They’re the building blocks of ideas and how those ideas relate to each other. (They’re not just “keywords.”)

Search engines and LLMs use semantic relationships between entities to (1) reduce ambiguity, (2) reinforce authority/canonical sources on your site, and (3) map out relationships between topics, features, services, and audiences across your site.

When you internally link pages together with strategically descriptive anchors, you’re telling search engines how your site fits together … and you’re training them on how entities across your site connect.

Therefore, by practicing internal linking through an entity-based lens, you’re creating stronger, clearer relationships and patterns for Google/search engines/LLMs to understand.

Using Internal Links As Entity Connectors – How To Do It

Entity-first SEO starts with defining the people, products, concepts, and places your brand “owns.”

If you’re a B2B SaaS company offering a CRM, those entities might include your:

  • Core product (CRM platform).
  • Features (pipeline management, email automation, reporting dashboards).
  • Use cases (sales enablement, customer support, marketing teams).
  • Personas/target ICPs (heads of sales at mid-market companies, startup founders scaling revenue teams, or enterprise IT buyers).

Taking this example, you’re going to think in terms of topic-first SEO:

  • Hub or pillar pages = parent entities. These are your central nodes – the definitive resource on a core concept. For a B2B SaaS CRM, it might be the CRM platform overview page.
  • Cluster pages = sub-entities. These are the supporting nodes that expand on the hub. For a CRM, the CRM hub branches into feature pages like pipeline management, email automation, and reporting dashboards.
  • Cross-link clusters to show relatedness. Don’t just point everything back to the hub – connect the clusters to each other to model real-world relationships. In the instance of the CRM, pipeline management integrates with email automation to shorten deal cycles.
  • Navigation and breadcrumbs reinforce hierarchy. The visible structure tells both users and Google how entities fit together. Example: Home → Products → CRM → Pipeline Management.
  • Include personas in the implementation. This reinforces the relationship: This persona → has this pain point → solved by this feature → within this product topic.
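To reinforce that visible hierarchy in markup as well, the breadcrumb trail can be expressed as BreadcrumbList structured data. A minimal sketch for the hypothetical Home → Products → CRM → Pipeline Management path:

  {
    "@context": "https://schema.org",
    "@type": "BreadcrumbList",
    "itemListElement": [
      { "@type": "ListItem", "position": 1, "name": "Home", "item": "https://example.com/" },
      { "@type": "ListItem", "position": 2, "name": "Products", "item": "https://example.com/products" },
      { "@type": "ListItem", "position": 3, "name": "CRM", "item": "https://example.com/products/crm" },
      { "@type": "ListItem", "position": 4, "name": "Pipeline Management", "item": "https://example.com/products/crm/pipeline-management" }
    ]
  }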

For example, look at this topic cluster map created with Screaming Frog:

Image Credit: Kevin Indig

It shows two clusters with nodes very close together (red and orange) and three other clusters that are spread apart (green, blue, and purple). Guess which clusters outperform the others in organic search? Red and orange!

Here’s how you connect those entities into a meaningful structure in the copy on the page:

1. Anchor text = entity disambiguation.

Instead of linking with vague text, use descriptive anchors that clarify which entity the link refers to. For example, if your CRM has a feature page about pipeline management, link to it with “sales pipeline management CRM feature” language.

2. Consistency matters.

If you link to that pipeline management page with scattered variations like “pipeline automation tool,” “deal tracking software,” and “CRM feature,” you dilute the entity connection. (Close variations like “pipeline management tool,” “sales pipeline management CRM feature,” and “pipeline management features” are fine, since they’re derivatives of the same entity.)

By sticking to clear, consistent anchors, you signal to Google that this is the page that defines “pipeline management” for your brand.

3. Context strengthens meaning.

The sentence or paragraph around the link can add semantic weight. For example:

“Our CRM includes pipeline management, so your sales team can track every deal from prospecting to close.”

That tells Google (and users) that pipeline management isn’t just a phrase; it’s a core feature within the CRM product.
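In the page’s HTML, that example is simply a descriptive anchor embedded in an entity-rich sentence (the URL is hypothetical):

  <p>Our CRM includes <a href="/features/pipeline-management">pipeline management</a>,
  so your sales team can track every deal from prospecting to close.</p>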

4. Include personas.

Making personas a criterion for internal linking is a no-brainer, because from a psychological perspective, a link automatically signals “there’s more for you here.”

If your internal link is placed on the right word that triggers a response in your target ICPs (and in the right areas of the page), it increases the chance of people staying on the site. It’s also just a better experience – and good customer service – to help site visitors find the right offering for themselves, all with the goal of increasing trust and the chances they take an action or convert.

If one of your ICPs is head of Sales at mid-market SaaS companies, you might internally link from a blog article like “10 Ways SaaS Sales Leaders Can Shorten Their Sales Cycle” directly to your pipeline management feature page, while using copy surrounding that link that explains how your offering solves this problem. That link makes the relationship explicit: This is the feature that solves this persona’s pain point.

Ultimately, think of every internal link as a connector in your brand’s knowledge graph.

Together, these links show how entities and topics (like CRM platform → pipeline management → sales enablement → head of sales persona) relate to each other, and why your site is authoritative on them.

Amanda Johnson jumping in here to add: Basically, show + tell people (and search engines/LLMs) what you want them to know via literal semantics. It really is that simple. No need to overthink this. Use clear, descriptive, accurate anchor text for the internally linked page, use it consistently, and give context as to how/why the page is linked there with surrounding copy.

Ultimately, if you practice internal linking thoughtfully and methodically, you end up with a better user experience and more thorough reinforcement of internal entity relationships (which can improve topical authority signals).

Worried that your most important pages aren’t getting enough visibility because you haven’t set up a clear linking structure? Following the guidance above will help you resolve this and set up a clear internal linking system.

And using tools that offer internal link auditing (like Semrush, Ahrefs, Clearscope, Surfer, etc.) will help you implement your system. Some SEO tools also give page-level internal linking recommendations and suggest which on-page copy to use as anchor text.

It’s Time To Update How You Think About Internal Linking

Internal linking hasn’t just been about crawlability for some time now.

By structuring links around topics, entities, (and even user journeys of your target personas), you communicate your site’s semantic map to Google and LLMs.


Featured Image: Paulo Bobita/Search Engine Journal


Use IndexNow For AI Search And Shopping SEO

IndexNow, paired with structured data, is an ideal way for online stores to get product data into AI-powered search.

Roger Montti

Microsoft Bing published an announcement stating that the IndexNow search crawling technology is a powerful way for ecommerce companies to surface the latest and most accurate shopping-related information in AI Search and search engine shopping features.

Generative Search Requires Timely Shopping Information

Ecommerce sites typically depend on merchant feeds, search engine crawling, and updates to Schema.org structured data to communicate what’s for sale: new products, retired products, price changes, availability, and other important details. Each of those methods can be a point of failure, as slow crawling by search engines and inconsistent updating can delay the correct information from surfacing in AI search and shopping features.

IndexNow solves that problem. Content platforms like Wix, Duda, Shopify, and WooCommerce support IndexNow, a Microsoft technology that speeds up the indexing of new or updated content. Pairing IndexNow with Schema.org markup helps ensure fast indexing so that the correct information surfaces in AI search and shopping features.
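Under the hood, IndexNow is a simple HTTP API: you host a key file on your domain to prove ownership, then ping the endpoint whenever URLs are added, updated, or removed. A sketch of a bulk submission, with the host, key, and URLs as placeholders:

  POST https://api.indexnow.org/indexnow
  Content-Type: application/json; charset=utf-8

  {
    "host": "www.example-store.com",
    "key": "d4a1c9e2b7f643ce9f1a2b3c4d5e6f70",
    "keyLocation": "https://www.example-store.com/d4a1c9e2b7f643ce9f1a2b3c4d5e6f70.txt",
    "urlList": [
      "https://www.example-store.com/products/widget-123",
      "https://www.example-store.com/products/widget-456"
    ]
  }

Platforms with built-in IndexNow support handle this ping automatically whenever a product page is published or changed.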

IndexNow recommends the following Schema.org Product type properties:

  • title (name in JSON-LD)
  • description
  • price (list/retail price)
  • link (product landing page URL)
  • image link (image in JSON-LD)
  • shipping (especially important for Germany and Austria)
  • id (a unique identifier for the product)
  • brand
  • gtin
  • mpn
  • datePublished
  • dateModified

Optional fields to further enhance context and classification:

  • category (helps group products for search and shopping platforms)
  • seller (recommended for marketplaces or resellers)
  • itemCondition (e.g., NewCondition, UsedCondition)
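As a condensed sketch, here is how several of those fields map into Product JSON-LD; all values are placeholders, and note that price and condition live on the nested Offer:

  {
    "@context": "https://schema.org",
    "@type": "Product",
    "@id": "https://www.example-store.com/products/widget-123",
    "name": "Widget 123",
    "description": "A compact widget for home workshops.",
    "image": "https://www.example-store.com/images/widget-123.jpg",
    "url": "https://www.example-store.com/products/widget-123",
    "brand": { "@type": "Brand", "name": "ExampleBrand" },
    "gtin": "00012345678905",
    "mpn": "W123",
    "category": "Tools > Widgets",
    "offers": {
      "@type": "Offer",
      "price": "49.99",
      "priceCurrency": "EUR",
      "itemCondition": "https://schema.org/NewCondition",
      "availability": "https://schema.org/InStock"
    }
  }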

Read more at Microsoft Bing’s Blog:

IndexNow Enables Faster and More Reliable Updates for Shopping and Ads


Featured Image by Shutterstock/Paper piper

Structured Data’s Role In AI And AI Search Visibility

Google, Microsoft, and OpenAI are clear: Structured data shapes AI visibility. Is your brand ready? Read the full breakdown.

Martha van Berkel

The way people find and consume information has shifted. We, as marketers, must think about visibility across AI platforms and Google.

The challenge is that we don’t have the same ability to control and measure success as we do with Google and Microsoft, so it feels like we’re flying blind.

Earlier this year, Google, Microsoft, and ChatGPT each commented on how structured data can help LLMs better understand your digital content.

Structured data can give AI tools the context they need to ground their understanding of content in entities and relationships. In this new era of search, you could say that context, not content, is king.

Schema Markup Helps To Build A Data Layer

By translating your content into Schema.org and defining the relationships between pages and entities, you are building a data layer for AI. This schema markup data layer, or what I like to call your “content knowledge graph,” tells machines what your brand is, what it offers, and how it should be understood.

This data layer is how your content becomes accessible and understood across a growing range of AI capabilities, including:

  • AI Overviews
  • Chatbots and voice assistants
  • Internal AI systems

Through grounding, structured data can contribute to visibility and discovery across Google, ChatGPT, Bing, and other AI platforms. It also prepares your web data to be valuable for accelerating your internal AI initiatives.

The same week that Google and Microsoft announced they were using structured data for their generative AI experiences, Google and OpenAI announced their support of the Model Context Protocol.

What Is Model Context Protocol?

In November 2024, Anthropic introduced Model Context Protocol (MCP), “an open protocol that standardizes how applications provide context to LLMs” and was subsequently adopted by OpenAI and Google DeepMind.

You can think of MCP as the USB-C connector for AI applications and agents or an API for AI. “MCP provides a standardized way to connect AI models to different data sources and tools.”

Since we are now thinking of structured data as a strategic data layer, the problem Google and OpenAI need to solve is how to scale their AI capabilities efficiently and cost-effectively. Combining the structured data on your website with MCP would allow accurate inferencing at scale.

Structured Data Defines Entities And Relationships

LLMs generate answers based on the content they are trained on or connected to. While they primarily learn from unstructured text, their outputs can be strengthened when grounded in clearly defined entities and relationships, for example, via structured data or knowledge graphs.

Structured data can be used as an enhancer that allows enterprises to define key entities and their relationships.

When implemented using Schema.org vocabulary, structured data:

  • Defines the entities on a page: people, products, services, locations, and more.
  • Establishes relationships between those entities.
  • Can reduce hallucinations when LLMs are grounded in structured data through retrieval systems or knowledge graphs.

When schema markup is deployed at scale, it builds a content knowledge graph, a structured data layer that connects your brand’s entities across your site and beyond. 

A recent study by BrightEdge demonstrated that schema markup improved brand presence and perception in Google’s AI Overviews, noting higher citation rates on pages with robust schema markup.

Structured Data As An Enterprise AI Strategy

Enterprises can shift their view of structured data beyond the basic requirements for rich result eligibility to managing a content knowledge graph.

According to Gartner’s 2024 AI Mandates for the Enterprise Survey, participants cite data availability and quality as the top barrier to successful AI implementation.

By implementing structured data and developing a robust content knowledge graph, you can contribute to both external search performance and internal AI enablement.

A scalable schema markup strategy requires:

  • Defined relationships between content and entities: Schema markup properties connect all content and entities across the brand. All page content is connected in context.
  • Entity Governance: Shared definitions and taxonomies across marketing, SEO, content, and product teams.
  • Content Readiness: Ensuring your content is comprehensive, relevant, representative of the topics you want to be known for, and connected to your content knowledge graph.
  • Technical Capability: Cross-functional tools and processes to manage schema markup at scale and ensure accuracy across thousands of pages.

For enterprise teams, structured data is a cross-functional capability that prepares web data to be consumed by internal AI applications.

What To Do Next To Prepare Your Content For AI

Enterprise teams can align their content strategies with AI requirements. Here’s how to get started:

1. Audit your current structured data to identify gaps in coverage and whether schema markup is defining relationships within your website. This context is critical for AI inferencing.

2. Map your brand’s key entities, such as products, services, people, and core topics, and ensure they are clearly defined and consistently marked up with schema markup across your content. This includes identifying the main page that defines an entity, known as the entity home.

3. Build or expand your content knowledge graph by connecting related entities and establishing relationships that AI systems can understand.

4. Integrate structured data into AI budgeting and planning alongside other AI investments, whether the content is intended for AI Overviews, chatbots, or internal AI initiatives.

5. Operationalize schema markup management by developing repeatable workflows for creating, reviewing, and updating schema markup at scale.
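As a quick illustration of step 2, an entity home page typically carries a stable @id plus sameAs links to corroborating profiles, so every other mention can point back to one canonical node. A hypothetical sketch:

  {
    "@context": "https://schema.org",
    "@type": "Organization",
    "@id": "https://www.example.com/#organization",
    "name": "Example Co.",
    "url": "https://www.example.com/",
    "sameAs": [
      "https://www.linkedin.com/company/example-co",
      "https://x.com/exampleco"
    ]
  }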

By taking these steps, enterprises can ensure that their data is AI-ready, inside and outside the enterprise.

Structured Data Provides A Machine-Readable Layer

Structured data doesn’t guarantee placement in AI Overviews or directly control what large language models say about your brand. LLMs are still primarily trained on unstructured text, and AI systems weigh many signals when generating answers.

What structured data does provide is a strategic, machine-readable layer. When used to build a knowledge graph, schema markup defines entities and the relationships between them, creating a reliable framework that AI systems can draw from. This reduces ambiguity, strengthens attribution, and makes it easier to ground outputs in fact-based content when structured data is part of a connected retrieval or grounding system.

By investing in semantic, large-scale schema markup and aligning it across teams, organizations position themselves to be as discoverable in AI experiences as possible.

Featured Image: Koto Amatsukami/Shutterstock

AI Search Optimization: Make Your Structured Data Accessible

An investigation reveals AI crawlers miss JavaScript-injected structured data. Use server-side rendering or static HTML to ensure visibility.

Matt G. Southern

A recent investigation has uncovered a problem for websites relying on JavaScript for structured data.

This data, often in JSON-LD format, is difficult for AI crawlers to access if not in the initial HTML response.

Crawlers like GPTBot (used by ChatGPT), ClaudeBot, and PerplexityBot can’t execute JavaScript and miss any structured data added later.

This creates challenges for websites using tools like Google Tag Manager (GTM) to insert JSON-LD on the client side, as many AI crawlers can’t read dynamically generated content.

Key Findings About JSON-LD & AI Crawlers

Elie Berreby, the founder of SEM King, examined what happens when JSON-LD is added using Google Tag Manager (GTM) without server-side rendering (SSR).

He found out why this type of structured data is often not seen by AI crawlers:

  1. Initial HTML Load: When a crawler requests a webpage, the server returns the first HTML version. If structured data is added with JavaScript, it won’t be in this initial response.
  2. Client-Side JavaScript Execution: JavaScript runs in the browser and changes the Document Object Model (DOM) for users. At this stage, GTM can add JSON-LD to the DOM.
  3. Crawlers Without JavaScript Rendering: AI crawlers that can’t run JavaScript cannot see changes in the DOM. This means they miss any JSON-LD added after the page loads.

In summary, structured data added only through client-side JavaScript is invisible to most AI crawlers.

Why Traditional Search Engines Are Different

Traditional search crawlers like Googlebot can read JavaScript and process changes made to a webpage after it loads, including JSON-LD data injected by Google Tag Manager (GTM).

In contrast, many AI crawlers can’t read JavaScript and only see the raw HTML from the server. As a result, they miss dynamically added content, like JSON-LD.


Google’s Warning on Overusing JavaScript

This challenge ties into a broader warning from Google about the overuse of JavaScript.

In a recent podcast, Google’s Search Relations team discussed the growing reliance on JavaScript. While it enables dynamic features, it’s not always ideal for essential SEO elements like structured data.

Martin Splitt, Google’s Search Developer Advocate, explained that websites range from simple pages to complex applications. It’s important to balance JavaScript use with making key content available in the initial HTML.

John Mueller, another Google Search Advocate, agreed, noting that developers often turn to JavaScript when simpler options, like static HTML, would be more effective.

What To Do Instead

Developers and SEO professionals should ensure structured data is accessible to all crawlers to avoid issues with AI search crawlers.

Here are some key strategies:

  1. Server-Side Rendering (SSR): Render pages on the server to include structured data in the initial HTML response.
  2. Static HTML: Use schema markup directly in the HTML to limit reliance on JavaScript.
  3. Prerendering: Offer prerendered pages where JavaScript has already been executed, providing crawlers with fully rendered HTML.

These approaches align with Google’s advice to prioritize HTML-first development and include important content like structured data in the initial server response.
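As a minimal sketch of the static HTML approach, the JSON-LD ships inside the initial server response, so a crawler that never executes JavaScript still receives it (values are placeholders):

  <head>
    <title>Widget 123 | Example Store</title>
    <script type="application/ld+json">
    {
      "@context": "https://schema.org",
      "@type": "Product",
      "name": "Widget 123",
      "description": "A compact widget for home workshops."
    }
    </script>
  </head>

Viewing the page source (not the rendered DOM) is the quickest way to confirm the markup really appears in the first response.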


Why This Matters

AI crawlers will only grow in importance, and they play by different rules than traditional search engines.

If your site depends on GTM or other client-side JavaScript for structured data, you’re missing out on opportunities to rank in AI-driven search results.

By shifting to server-side or static solutions, you can future-proof your site and ensure visibility in traditional and AI searches.


Featured Image: nexusby/Shutterstock

The AI Search Blueprint: What Works Now (And What’s Holding You Back)
In partnership with Rundown

Optimizing for AI search does not mean starting from scratch. This stack gives SEO leaders a clear, updated blueprint on what matters now vs. what to skip.

You’ll learn to:

  • Gain SERP visibility by structuring content around key AI entities
  • Structure your data without adding more JavaScript
  • Use IndexNow to get discovered faster across AI search
  • Uncover the pitfalls behind why your brand is (or isn’t) being cited
