
Google Search Central APAC 2025: Everything From Day 2

The second day at Search Central Live was packed with breaking news about the Google Trends API launch and insights on links, Schema, and the use of AI imagery.

  • Cherry Prommawin said links are still an important part of the internet and Google uses them for ranking.
  • Gary Illyes said excessive or redundant schema only adds page bloat and is not used as part of the ranking process.
  • Ian Huang said Google doesn't care whether your images are generated by AI or humans.
  • Daniel Waisberg and Hadas Jacobi unveiled the new Google Trends API (Alpha).

The second day of the Google Search Central Live APAC 2025 kicked off with a brief tie‑in to the previous day’s deep dive into crawling, before moving squarely into indexing.

Cherry Prommawin opened by walking us through how Google parses HTML, highlighting the key stages of indexing:

  1. HTML parsing.
  2. Rendering and JavaScript execution.
  3. Deduplication.
  4. Feature extraction.
  5. Signal extraction.

This set the theme for the rest of the day.

Cherry noted that Google first normalizes the raw HTML into a DOM, then looks for header and navigation elements, and determines which section holds the main content. During this process, it also extracts elements such as rel=canonical, hreflang, links and anchors, and meta-robots tags.
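The parsing step Cherry described can be sketched with Python's standard `html.parser` — a toy stand-in for Google's own normalizer that pulls out the same elements she listed (rel=canonical, hreflang alternates, anchors, and meta robots):

```python
# Minimal sketch, not Google's parser: walk the HTML and collect the
# indexing-relevant elements -- canonical, hreflang alternates, anchor
# links, and meta robots. Standard library only.
from html.parser import HTMLParser

class IndexingSignalParser(HTMLParser):
    def __init__(self):
        super().__init__()
        self.canonical = None
        self.hreflang = {}      # language code -> alternate URL
        self.links = []         # anchor hrefs
        self.meta_robots = None

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "link":
            rel = (a.get("rel") or "").lower()
            if rel == "canonical":
                self.canonical = a.get("href")
            elif rel == "alternate" and "hreflang" in a:
                self.hreflang[a["hreflang"]] = a.get("href")
        elif tag == "a" and "href" in a:
            self.links.append(a["href"])
        elif tag == "meta" and (a.get("name") or "").lower() == "robots":
            self.meta_robots = a.get("content")

html_doc = """
<html><head>
<link rel="canonical" href="https://example.com/page">
<link rel="alternate" hreflang="en-au" href="https://example.com/au/page">
<meta name="robots" content="noimageindex">
</head><body><a href="/about">About</a></body></html>
"""
p = IndexingSignalParser()
p.feed(html_doc)
print(p.canonical, p.hreflang, p.links, p.meta_robots)
```

The `example.com` URLs are placeholders; the point is that all of these signals are extracted in a single pass over the normalized DOM.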

“There is no preference between responsive websites versus dynamic/adaptive websites. Google doesn’t try to detect this and doesn’t have a preferential weighting.” – Cherry Prommawin

Links remain central to the web’s structure, both for discovery and for ranking:

“Links are still an important part of the internet and used to discover new pages, and to determine site structure, and we use them for ranking.” – Cherry Prommawin

Controlling Indexing With Robots Rules

Gary Illyes clarified where robots.txt and robots‑meta tags fit into the flow:

  • Robots.txt controls what crawlers can fetch.
  • Robots meta tags control how that fetched data is used downstream.

He highlighted several lesser‑known directives:

  • none: Equivalent to noindex,nofollow combined into a single rule. While functionally identical, using one directive instead of two can simplify tag management.
  • notranslate: If set, Chrome will no longer offer to translate the page.
  • noimageindex: Also applies to video assets.
  • unavailable_after: Despite being introduced by engineers who have since moved on, this directive still works. It can be useful for deprecating time‑sensitive blog posts, such as limited‑time deals and promotions, so they don’t persist in Google’s AI features and risk misleading users or harming brand perception.
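As a quick illustration of the `none` shorthand, here is a small, hypothetical Python helper (not part of any Google tooling) that expands robots‑meta directives and confirms `none` is functionally identical to `noindex, nofollow`:

```python
# Hedged sketch: expand shorthand robots-meta directives. The directive
# names are the documented ones; the expansion logic here is ours.
def expand_robots_meta(content: str) -> set[str]:
    directives = {d.strip().lower() for d in content.split(",") if d.strip()}
    if "none" in directives:
        directives.discard("none")
        directives.update({"noindex", "nofollow"})
    return directives

# "none" collapses two rules into one token
assert expand_robots_meta("none") == expand_robots_meta("noindex, nofollow")
```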

    Understanding What’s On A Page

    Gary Illyes emphasized that the main content, as defined by Google’s Quality Rater Guidelines, is the most critical element in crawling and indexing. It might be text, images, videos, or rich features like calculators.

    He showed how shifting a topic into the main content area can boost rankings.

    In one example, moving references to “Hugo 7” from a sidebar into the central (main) content led to a measurable increase in visibility.

    “If you want to rank for certain things, put those words and topics in important places (on the page).” – Gary Illyes

    Tokenization For Search

    You can’t dump raw HTML into a searchable index at scale. Google breaks it into “tokens,” individual words or phrases, and stores those in its index.
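The idea can be illustrated with a toy tokenizer and inverted index in Python — nothing like Google's actual system, but the same basic shape: strip markup, split into word tokens, map each token to the documents containing it:

```python
# Toy illustration only: tokenize stripped HTML and build an inverted
# index mapping token -> set of document IDs.
import re
from collections import defaultdict

def tokenize(html: str) -> list[str]:
    text = re.sub(r"<[^>]+>", " ", html)           # drop tags
    return re.findall(r"[a-z0-9]+", text.lower())  # word tokens

index = defaultdict(set)
docs = {1: "<p>Hugo 7 review</p>", 2: "<p>Hugo 7 specs and price</p>"}
for doc_id, html in docs.items():
    for token in tokenize(html):
        index[token].add(doc_id)

print(index["hugo"])  # both documents mention "hugo"
```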

    The first HTML segmentation system dates back to Google’s 2001 Tokyo engineering office, and the same tokenization methods power its AI products, since “why reinvent the wheel.”

    When the main content is thin or low value, what Google labels as a “soft 404,” it’s flagged with a centerpiece annotation to show that this deficiency is at the heart of the page, not just in a peripheral section.

    Handling Web Duplication

Image from author, July 2025

    Cherry Prommawin explained deduplication in three focus areas:

    1. Clustering: Using redirects, content similarity, and rel=canonical to group duplicate pages.
    2. Content checks: Checksums that ignore boilerplate and catch many soft‑error pages. Note that soft errors can bring down an entire cluster.
    3. Localization: When pages differ only by locale (for example via geo‑redirects), hreflang bridges them without penalty.
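The checksum-based content check can be sketched like this — assuming boilerplate (headers, footers, navigation) has already been stripped, so pages that differ only in chrome or whitespace hash identically and fall into the same cluster:

```python
# Sketch of the content-check idea: checksum only the normalized main
# content, so superficial differences don't break clustering.
import hashlib

def content_checksum(main_content: str) -> str:
    normalized = " ".join(main_content.split()).lower()
    return hashlib.sha256(normalized.encode()).hexdigest()

page_a = "Same article text here."
page_b = "Same  article   text here."  # whitespace differences only
assert content_checksum(page_a) == content_checksum(page_b)
```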

    She contrasted permanent versus temporary redirects: Both play a role in crawling and clustering, but only permanent redirects influence which URL is chosen as the cluster’s canonical.

    Google prioritizes hijacking risk first, user experience second, and site-owner signals (such as your rel=canonical) third when selecting the representative URL.

    Geotargeting

    Geotargeting allows you to signal to Google which country or region your content is most relevant for, and it works differently from simple language targeting.

    Prommawin emphasized that you don’t need to hide duplicate content across two country‑specific sites; hreflang will handle those alternates for you.

Image from author, July 2025

If you serve duplicate content on multiple regional URLs without localization, you risk confusing both crawlers and users.

    To geotarget effectively, ensure that each version has unique, localized content tailored to its specific audience.

    The primary geotargeting signals Google uses are:

    1. Country‑code top‑level domain (ccTLD): Domains like .sg or .au indicate the target country.
    2. Hreflang annotations: Use <link> tags, HTTP headers, or sitemap entries to declare language and regional alternates.
    3. Server location: The IP address or hosting location of your server can act as a geographic hint.
4. Additional local signals: Language and currency on the page, links from other regional websites, and signals from your local Business Profile all reinforce your target region.

    By combining these signals with genuinely localized content, you help Google serve the right version of your site to the right users, and avoid the pitfalls of unintended duplicate‑content clusters.
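For the hreflang signal specifically, the annotations are straightforward to generate. A small illustrative helper (the `example.com` URLs are placeholders) — note that every member of the cluster should list all alternates, including itself, plus an x-default fallback:

```python
# Illustrative only: emit hreflang <link> annotations for a set of
# regional alternates, including the x-default fallback.
alternates = {
    "en-sg": "https://example.com/sg/",
    "en-au": "https://example.com/au/",
    "x-default": "https://example.com/",
}

def hreflang_tags(alts: dict[str, str]) -> list[str]:
    return [
        f'<link rel="alternate" hreflang="{lang}" href="{url}">'
        for lang, url in sorted(alts.items())
    ]

for tag in hreflang_tags(alternates):
    print(tag)
```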

    Structured Data & Media

    Gary Illyes introduced the feature extraction phase, which runs after deduplication and is computationally expensive. It starts with HTML, then kicks off separate, asynchronous media indexing for images and videos.

    If your HTML is in the index but your media isn’t, it simply means the media pipeline is still working.

    Sessions in this track included:

    • Structured Data with William Prabowo.
    • Using Images with Ian Huang.
    • Engaging Users with Video with William Prabowo.

    Q&A Takeaway On Schema

    Schema markup can help Google understand the relationships between entities and enable LLM-driven features.

But excessive or redundant schema only adds page bloat and brings no additional ranking benefit; schema is not used as part of the ranking process.
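A lean JSON-LD block in that spirit might look like the following — illustrative only, marking up just the properties that express real entity relationships rather than every optional field:

```python
# Sketch: minimal JSON-LD for an Article entity. The Q&A takeaway is to
# keep markup to properties that describe real relationships; piling on
# redundant properties adds bytes, not rankings.
import json

article_schema = {
    "@context": "https://schema.org",
    "@type": "Article",
    "headline": "Google Search Central APAC 2025: Day 2",
    "author": {"@type": "Person", "name": "Dan Taylor"},
}
json_ld = (
    '<script type="application/ld+json">'
    + json.dumps(article_schema)
    + "</script>"
)
print(json_ld)
```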

    Calculating Signals

    During signal extraction, also part of indexing, Google computes a mix of:

    • Indirect signals (links, mentions by other pages).
    • Direct signals (on‑page words and placements).

Image from author, July 2025

    Illyes confirmed that Google still uses PageRank internally. It is not the exact algorithm from the 1996 White Paper, but it bears the same name.
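For reference, the classic power-iteration PageRank — the textbook formulation, not whatever Google runs internally today — fits in a few lines:

```python
# Textbook power-iteration PageRank on a tiny link graph.
def pagerank(links, damping=0.85, iters=50):
    nodes = list(links)
    ranks = {n: 1 / len(nodes) for n in nodes}
    for _ in range(iters):
        # everyone gets the teleport share, then link equity flows
        new = {n: (1 - damping) / len(nodes) for n in nodes}
        for src, outs in links.items():
            for dst in outs:
                new[dst] += damping * ranks[src] / len(outs)
        ranks = new
    return ranks

graph = {"a": ["b"], "b": ["a", "c"], "c": ["a"]}
ranks = pagerank(graph)
print(max(ranks, key=ranks.get))  # "a" receives the most link equity
```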

    Handling Spam

Google’s systems identify around 40 billion spam pages each day, powered by its LLM‑based “SpamBrain.”

Image from author, July 2025

    Additionally, Illyes emphasized that E-E-A-T is not an indexing or ranking signal. It’s an explanatory principle, not a computed metric.

    Deciding What Gets Indexed

    Index selection boils down to quality, defined as a combination of trustworthiness and utility for end users. Pages are dropped from the index for clear negative signals:

    • noindex directives.
    • Expired or time‑limited content.
    • Soft 404s and slipped‑through duplicates.
    • Pure spam or policy violations.

    If a page has been crawled but not indexed, the remedy is to improve the content quality.

    Internal linking can help, but only insofar as it makes the page genuinely more useful. Google’s goal is to reward user‑focused improvements, not signal manipulation.

    Google Doesn’t Care If Your Images Are AI-Generated

    AI-generated images have become common in marketing, education, and design workflows. These visuals are produced by deep learning models trained on massive picture collections.

    During the session, Huang outlined that Google doesn’t care whether your images are generated by AI or humans, as long as they accurately and effectively convey the information or tell the story you intend.

    As long as images are understandable, their AI origins are irrelevant. The primary goal is effective communication with your audience.

Huang highlighted an AI image the Google team used on the first day of the conference. On close inspection it does have some visual errors, but as a “prop” its job was to represent a timeline, not serve as the slide’s main content, so those errors don’t matter.

    Image from author, July 2025

    We can adopt a similar approach to our use of AI-generated imagery. If the image conveys the message and isn’t the main content of the page, minor issues won’t lead to penalization, nor will using AI-generated imagery in general.

Images should still undergo a quick human review to catch obvious mistakes before they reach production.

    Ongoing oversight remains essential to maintain trust in your visuals and protect your brand’s integrity.

    Google Trends API Announced

    Finally, Daniel Waisberg and Hadas Jacobi unveiled the new Google Trends API (Alpha). Key features of the new API will include:

    • Consistently scaled search interest data that does not recalibrate when you change queries.
• A five‑year rolling window, with data as recent as 48 hours old, for seasonal and historical comparisons.
    • Flexible time aggregation (weekly, monthly, yearly).
    • Region and sub‑region breakdowns.

    This opens up a world of programmatic trend analysis with reliable, comparable metrics over time.

That wraps up day two. Tomorrow, we’ll have coverage of the third and final day of Google Search Central Live, with more breaking news and insights.

    Featured Image: Dan Taylor/SALT.agency

VIP Contributor: Dan Taylor, Partner & Head of Technical SEO at SALT.agency
