Does Google show a preference for some top-level domains? Search marketer Bill Hartzer set out to learn how Google treats different domains in a research experiment with surprising results.
It should be noted that experiments and tests based on nonsense words (nonexistent words) may not provide actionable insights into Google’s algorithm.
Some believe there are insights about Google’s algorithm that can be glimpsed in “research” on search results for nonsense words.
Others understand that there are flaws in research based on nonsense words.
The reason is that the full weight of Google’s algorithm (the links, the AI, the machine learning) can’t be brought to bear on nonsense words, because the algorithms are trained on actual words.
There is no search data, no click data, no relevance data for nonsense words. This is called Data Scarcity.
That’s why Google’s algorithms sometimes fail at searches for which there is no history of searches and no pages that are relevant for that query.
How can algorithms trained on actual words, including the BERT algorithm, deal with non-existent words?
That’s the caveat with experiments that use nonsense words: they may be unsuitable for drawing conclusions about Google’s algorithm.
But others believe otherwise. You make up your own mind.
The Research Experiment
The person who conducted the research is Bill Hartzer (@bhartzer). Bill is a respected search marketer with over 25 years of experience and a decades-long track record of accurate assessments.
The research had the following features:
- 15 domains on 15 different TLDs, each using the same nonsense phrase, “nocseman,” as the domain name
- Each site had 25 pages
- Each site had unique content
- Each site had its own target keyword (a made-up nonsense word other than nocseman)
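For readers who want to try a similar test, here is a minimal sketch of how indexing rates could be compared across TLDs. The function names and the sample counts are hypothetical and are not taken from Bill Hartzer's data; the only figure borrowed from the experiment is the 25 pages per site.

```python
from typing import Dict, List, Tuple

PAGES_PER_SITE = 25  # each test site had 25 pages, per the experiment description


def indexing_rate(indexed_pages: int, total_pages: int = PAGES_PER_SITE) -> float:
    """Return the fraction of a site's pages that are indexed."""
    if total_pages <= 0:
        raise ValueError("total_pages must be positive")
    if not 0 <= indexed_pages <= total_pages:
        raise ValueError("indexed_pages must be between 0 and total_pages")
    return indexed_pages / total_pages


def rank_tlds_by_indexing(indexed_counts: Dict[str, int]) -> List[Tuple[str, float]]:
    """Sort TLDs from highest to lowest indexing rate (ties broken alphabetically)."""
    rates = {tld: indexing_rate(count) for tld, count in indexed_counts.items()}
    return sorted(rates.items(), key=lambda item: (-item[1], item[0]))


# Placeholder counts for illustration only; NOT results from the experiment.
example_counts = {".com": 25, ".net": 20, ".co": 10}
for tld, rate in rank_tlds_by_indexing(example_counts):
    print(f"{tld}: {rate:.0%}")
```

Recording the counts on several dates and re-running the comparison would also show whether any apparent preference persists over time.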
Results and Claims
Here are the results claimed by the research:
- Google reads the keyword in the domain name.
- Bill Hartzer also found that it’s hard to get Google to index pages on new domain names and websites. Bill found that it was necessary to verify the site in Google Search Console and use the URL Inspection Tool to get the sites indexed. The URL Inspection Tool puts the URL in a queue to be indexed.
- Google by default appeared to consider dot CO (.CO) sites to be in Spanish even when there was no Spanish content on the page and no hreflang tags were present.
- Similarly, .DE sites were automatically considered to be in the German language unless specified to be another language.
- Bill also discovered that verifying the site in Google Search Console and manually submitting one page with the URL Inspection Tool was enough to get the entire website indexed. In other words, it was not necessary to manually submit every page one at a time.
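The article does not say how a site's language was "specified" in the test. As a rough sketch, the snippet below checks two common on-page language declarations, the `lang` attribute on `<html>` and `hreflang` values on `<link rel="alternate">` tags. The class and function names are hypothetical, and nothing here claims that Google relies on these particular signals.

```python
from html.parser import HTMLParser


class LanguageSignalParser(HTMLParser):
    """Collect the <html lang> value and hreflang values from alternate <link> tags."""

    def __init__(self):
        super().__init__()
        self.html_lang = None
        self.hreflangs = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)  # attribute names arrive lowercased
        if tag == "html" and attr_map.get("lang"):
            self.html_lang = attr_map["lang"]
        elif tag == "link" and attr_map.get("hreflang"):
            rel_tokens = (attr_map.get("rel") or "").lower().split()
            if "alternate" in rel_tokens:
                self.hreflangs.append(attr_map["hreflang"])


def language_signals(html: str) -> dict:
    """Return the language declarations found in an HTML document."""
    parser = LanguageSignalParser()
    parser.feed(html)
    parser.close()
    return {"html_lang": parser.html_lang, "hreflangs": parser.hreflangs}


# Made-up sample page for illustration; example.com is a placeholder domain.
sample = (
    '<html lang="en"><head>'
    '<link rel="alternate" hreflang="en-us" href="https://example.com/">'
    "</head><body></body></html>"
)
print(language_signals(sample))
```

A page that returns no `html_lang` and an empty `hreflangs` list declares no language on the page itself, which is the situation described for the .CO test sites.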
What some might find controversial is Bill’s assertion that the research shows Google has a bias toward certain top-level domains. A top-level domain (TLD) is the final part of a domain name, such as .com or .net.
So I asked Bill some questions about his research results.
This is what Bill Hartzer said:
“It’s clear that Google does, for whatever reason, prefer certain TLDs over others when it comes to indexing pages and initial keyword rankings.”
One problem I have with the research is that all the domains share the same name, nocseman. That seems to set the sites competing against each other for the made-up word, which could lead Google to choose one of them as the authoritative domain for it. From the outside, that might look like a bias.
“I am not sure if that’s necessarily a flaw. In this case, before the results (and the keyword was made public), the .net domain name ranked #1, and we’re seeing changes over time.
The fact that the keyword nocseman was only appearing in the domain name, that wasn’t the sole test here.
The SEO testing that was done was to look at indexing rates of each of the 25 page sites, and then initial rankings of each keyword assigned to each domain name.
So, each domain name had its own keyword assigned to it.
That we can check for nocseman as a keyword in the domain name is only a bonus of the testing.”
Google generally ranks pages and keywords using signals like the meaning of the content as well as external factors such as user preferences, user intent, links, etc.
Don’t you think a made-up keyword might not get folded into Google’s index properly, simply because Google has no data to understand it with? And that the lack of data that helps Google understand a web page and a website might negatively impact the results?
Surely, because what is being tested is abnormal and does not reflect real-world ranking processes, this might result in unreliable results?
“The testing shows that the made up word does get into the results, and for some sites it ranked really quickly for the made up word.
I think the point of the test, though, mainly was to test the TLDs in an unbiased way.
Of course all the content is nonsense content and not even optimized per se. And those sites are “ranking” for nocseman without it even being mentioned in the content.”
What would you say to the opinion that by using nonsense words you are not really testing Google’s normal indexing processes?
“It tests their normal indexing processes, but not necessarily other factors that come into play with rankings.
It tests brand new words (which occur all the time) and brand new domains. So they don’t have info about those words–so they have to figure it out quickly somehow.”
My understanding, based on observing how Google ranks very obscure phrases for which there are few web pages to give it context and data, is that Google seems to default to keyword-based ranking, almost as a fallback. It’s like Google might say, “I don’t understand this so I’ll just show pages that have this keyword phrase in it.”
“Maybe that’s something we have learned out of this: that they cannot effectively rank brand new words? I am not sure if they default to keyword ranking or not.
Once the nocseman keyword was made public and it’s mentioned on other sites, the rankings are all over the place.”
You’re right! It’s almost like a door opened to researching how, with more data, Google begins ranking web pages that are about an obscure word.
It would be interesting to see a follow-up test to find out whether similar results happen. Scientific research gains credibility when the results can be reproduced.
Nevertheless, Bill Hartzer made interesting observations about how Google might be slow to index new sites and how Google might show bias toward some top-level domains, among others. I’m not entirely convinced, because I’d like to see whether the results can be reproduced.
But that’s just me. Read his article and make up your own mind about a possible bias at Google: Top Level Domain Bias in Search Engine Indexing and Rankings