Google Search Advocate John Mueller has pushed back on the idea of building separate Markdown or JSON pages just for large language models (LLMs), saying he doesn’t see why LLMs would need pages that no one else sees.
The discussion started when Lily Ray asked on Bluesky about “creating separate markdown / JSON pages for LLMs and serving those URLs to bots,” and whether Google could share its perspective.
Not sure if you can answer, but starting to hear a lot about creating separate markdown / JSON pages for LLMs and serving those URLs to bots. Can you share Google's perspective on this?
The question draws attention to a developing trend in which publishers create “shadow” copies of important pages in formats that are easier for AI systems to parse.
There’s a more active discussion on this topic happening on X.
This has been the hot topic lately, I’ve been getting pitched by companies who do this https://t.co/rVnbPKUxZj
— Lily Ray 😏 (@lilyraynyc) November 23, 2025
What Mueller Said About LLM-Only Pages
Mueller replied that he isn’t aware of anything on Google’s side that would call for this kind of setup.
He noted that LLMs have worked with regular web pages from the beginning:
I’m not aware of anything in that regard. In my POV, LLMs have trained on – read & parsed – normal web pages since the beginning, it seems a given that they have no problems dealing with HTML. Why would they want to see a page that no user sees? And, if they check for equivalence, why not use HTML?
When Ray followed up about whether a separate format might help “expedite getting key points across to LLMs quickly,” Mueller argued that if file formats made a meaningful difference, you would likely hear that directly from the companies running those systems.
If those creating and running these systems knew they could create better responses from sites with specific file formats, I expect they would be very vocal about that. AI companies aren’t really known for being shy.
He said some pages may still work better for AI systems than others, but he doesn’t think that comes down to HTML versus Markdown:
That said I can imagine some pages working better for users and some better for AI systems, but I doubt that’s due to the file format, and it’s definitely not generalizable to everything. (Excluding JS which still seems hard for many of these systems).
Taken together, Mueller’s comments suggest that, from Google’s point of view, you don’t need to create bot-only Markdown or JSON clones of existing pages just to be understood by LLMs.
How Structured Data Fits In
Others in the thread drew a line between speculative “shadow” formats and cases where AI platforms have clearly defined feed requirements.
A reply from Matt Wright pointed to OpenAI’s eCommerce product feeds as an example where JSON schemas matter.
In that context, a defined spec governs how ChatGPT ingests and displays product data. Wright explained:
Interestingly, the OpenAI eCommerce product feeds are live: JSON schemas appear to have a key role in AI search already.
That example supports the idea that structured feeds and schemas are most important when a platform publishes a spec and asks you to use it.
Wright also pointed to a LinkedIn thread in which Chris Long observed that “editorial sites using product schemas, tend to get included in ChatGPT citations.”
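To make the “product schema” reference concrete, here is a minimal sketch of schema.org Product markup rendered as JSON-LD and embedded in a page’s existing HTML, rather than served as a separate bot-only document. The product name, description, and pricing below are hypothetical placeholders, not details from the thread.

```python
import json

# Minimal sketch of schema.org Product markup as JSON-LD.
# All product details are hypothetical placeholders.
product = {
    "@context": "https://schema.org",
    "@type": "Product",
    "name": "Example Running Shoe",             # placeholder name
    "description": "Lightweight trail shoe.",   # placeholder description
    "offers": {
        "@type": "Offer",
        "price": "129.99",
        "priceCurrency": "USD",
        "availability": "https://schema.org/InStock",
    },
}

# Embed the JSON-LD in a <script> tag inside the normal HTML page,
# so the same URL serves users and bots alike.
json_ld_tag = (
    '<script type="application/ld+json">\n'
    + json.dumps(product, indent=2)
    + "\n</script>"
)
print(json_ld_tag)
```

The point of the sketch is that structured data lives inside the pages you already publish; it does not require maintaining a parallel set of Markdown or JSON URLs.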
Why This Matters
If you’re weighing whether to build “LLM-optimized” Markdown or JSON versions of your content, this exchange can help steer you back to the basics.
Mueller’s comments reinforce that LLMs have long been able to read and parse standard HTML.
For most sites, it’s more productive to keep improving speed, readability, and content structure on the pages you already have, and to implement schema where there’s clear platform guidance.
At the same time, the Bluesky thread shows that AI-specific formats are starting to emerge in narrow areas such as product feeds. Those are worth tracking, but they’re tied to explicit integrations, not a blanket rule that Markdown is better for LLMs.
Looking Ahead
The conversation highlights how quickly AI-driven changes in search turn into technical requests for SEO and development teams, often before any documentation exists to support them.
Until LLM providers publish more concrete guidelines, this thread points you back to work you can justify today: keep your HTML clean, reduce unnecessary JavaScript where it makes content hard to parse, and use structured data where platforms have clearly documented schemas.
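One way to act on the JavaScript caveat Mueller raised is to spot-check whether important copy appears in the raw HTML a crawler receives before any scripts run. Below is a minimal sketch of that check; the URL and key phrase are placeholders, and a real audit would cover the main templates across a site rather than a single page.

```python
import requests

# Quick sanity check: does a key phrase appear in the raw HTML response,
# i.e., without JavaScript rendering? URL and phrase are hypothetical.
URL = "https://example.com/guide-to-product-feeds"
KEY_PHRASE = "product feed requirements"

resp = requests.get(URL, timeout=10)
resp.raise_for_status()

if KEY_PHRASE.lower() in resp.text.lower():
    print("Key content is present in the initial HTML.")
else:
    print("Key content may only appear after JavaScript runs; "
          "crawlers and LLM fetchers that skip rendering could miss it.")
```

If the phrase only shows up after client-side rendering, that is a stronger signal to act on than any switch in file format.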
Featured Image: Roman Samborskyi/Shutterstock