Google Search Advocate John Mueller has pushed back on the idea of building separate Markdown or JSON pages just for large language models (LLMs), saying he doesn’t see why LLMs would need pages that no one else sees.
The discussion started when Lily Ray asked on Bluesky about “creating separate markdown / JSON pages for LLMs and serving those URLs to bots,” and whether Google could share its perspective.
Not sure if you can answer, but starting to hear a lot about creating separate markdown / JSON pages for LLMs and serving those URLs to bots. Can you share Google's perspective on this?
The question draws attention to a developing trend in which publishers create “shadow” copies of important pages in formats that are easier for AI systems to parse.
There’s a more active discussion on this topic happening on X.
This has been the hot topic lately, I’ve been getting pitched by companies who do this https://t.co/rVnbPKUxZj
— Lily Ray 😏 (@lilyraynyc) November 23, 2025
What Mueller Said About LLM-Only Pages
Mueller replied that he isn’t aware of anything on Google’s side that would call for this kind of setup.
He noted that LLMs have worked with regular web pages from the beginning:
I’m not aware of anything in that regard. In my POV, LLMs have trained on – read & parsed – normal web pages since the beginning, it seems a given that they have no problems dealing with HTML. Why would they want to see a page that no user sees? And, if they check for equivalence, why not use HTML?
When Ray followed up about whether a separate format might help “expedite getting key points across to LLMs quickly,” Mueller argued that if file formats made a meaningful difference, you would likely hear that directly from the companies running those systems.
If those creating and running these systems knew they could create better responses from sites with specific file formats, I expect they would be very vocal about that. AI companies aren’t really known for being shy.
He said some pages may still work better for AI systems than others, but he doesn’t think that comes down to HTML versus Markdown:
That said I can imagine some pages working better for users and some better for AI systems, but I doubt that’s due to the file format, and it’s definitely not generalizable to everything. (Excluding JS which still seems hard for many of these systems).
Taken together, Mueller’s comments suggest that, from Google’s point of view, you don’t need to create bot-only Markdown or JSON clones of existing pages just to be understood by LLMs.
How Structured Data Fits In
Others in the thread drew a line between speculative “shadow” formats and cases where AI platforms have clearly defined feed requirements.
A reply from Matt Wright pointed to OpenAI’s eCommerce product feeds as an example where JSON schemas matter.
In that context, a defined spec governs how ChatGPT ingests and displays product data. Wright explained:
Interestingly, the OpenAI eCommerce product feeds are live: JSON schemas appear to have a key role in AI search already.
That example supports the idea that structured feeds and schemas are most important when a platform publishes a spec and asks you to use it.
Wright also pointed to a LinkedIn thread in which Chris Long observed that “editorial sites using product schemas, tend to get included in ChatGPT citations.”
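To make the “product schema” reference concrete, here is a minimal sketch of schema.org Product markup rendered as JSON-LD and embedded in a page’s existing HTML, rather than served as a separate bot-only document. The product name, description, and pricing below are hypothetical placeholders, not details from the thread.

```python
import json

# Minimal sketch of schema.org Product markup as JSON-LD.
# All product details are hypothetical placeholders.
product = {
    "@context": "https://schema.org",
    "@type": "Product",
    "name": "Example Running Shoe",             # placeholder name
    "description": "Lightweight trail shoe.",   # placeholder description
    "offers": {
        "@type": "Offer",
        "price": "129.99",
        "priceCurrency": "USD",
        "availability": "https://schema.org/InStock",
    },
}

# Embed the JSON-LD in a <script> tag inside the normal HTML page,
# so the same URL serves users and bots alike.
json_ld_tag = (
    '<script type="application/ld+json">\n'
    + json.dumps(product, indent=2)
    + "\n</script>"
)
print(json_ld_tag)
```

The point of the sketch is that structured data lives inside the pages you already publish; it does not require maintaining a parallel set of Markdown or JSON URLs.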
Why This Matters
If you’re weighing whether to build “LLM-optimized” Markdown or JSON versions of your content, this exchange can help steer you back to the basics.
Mueller’s comments reinforce that LLMs have long been able to read and parse standard HTML.
For most sites, it’s more productive to keep improving speed, readability, and content structure on the pages you already have, and to implement schema where there’s clear platform guidance.
At the same time, the Bluesky thread shows that AI-specific formats are starting to emerge in narrow areas such as product feeds. Those are worth tracking, but they’re tied to explicit integrations, not a blanket rule that Markdown is better for LLMs.
Looking Ahead
The conversation highlights how quickly AI-driven changes in search turn into technical requests for SEO and development teams, often before any documentation exists to support them.
Until LLM providers publish more concrete guidelines, this thread points you back to work you can justify today: keep your HTML clean, reduce unnecessary JavaScript where it makes content hard to parse, and use structured data where platforms have clearly documented schemas.
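One way to act on the JavaScript caveat Mueller raised is to spot-check whether important copy appears in the raw HTML a crawler receives before any scripts run. Below is a minimal sketch of that check; the URL and key phrase are placeholders, and a real audit would cover the main templates across a site rather than a single page.

```python
import requests

# Quick sanity check: does a key phrase appear in the raw HTML response,
# i.e., without JavaScript rendering? URL and phrase are hypothetical.
URL = "https://example.com/guide-to-product-feeds"
KEY_PHRASE = "product feed requirements"

resp = requests.get(URL, timeout=10)
resp.raise_for_status()

if KEY_PHRASE.lower() in resp.text.lower():
    print("Key content is present in the initial HTML.")
else:
    print("Key content may only appear after JavaScript runs; "
          "crawlers and LLM fetchers that skip rendering could miss it.")
```

If the phrase only shows up after client-side rendering, that is a stronger signal to act on than any switch in file format.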
Featured Image: Roman Samborskyi/Shutterstock