John Mueller, a webmaster trends analyst at Google, recently explained how Googlebot finds sites when there are no links pointing to them.
He was responding to the following question: “How does Googlebot find a site if no one is linking to the site, and it’s not been submitted to Search Console?”
In response, Mueller says it’s “tricky” to determine exactly how these sites are found.
Some possibilities include:
- Third parties that track domain registrations (with links)
- Accidental backlinks caused by typos in the URL
- Toolbars that link to related content
- A sitemap or RSS/Atom feed generated automatically by the CMS
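For site owners who want to audit the last item on that list, here is a minimal Python sketch. It assumes two common conventions: feeds advertised through `<link rel="alternate">` tags in the page head, and sitemaps listed on `Sitemap:` lines in robots.txt. The function names `find_feeds` and `find_sitemaps` are hypothetical, and this is not a description of how Googlebot itself discovers anything.

```python
from html.parser import HTMLParser

# MIME types conventionally used for RSS and Atom feed autodiscovery links.
FEED_TYPES = {"application/rss+xml", "application/atom+xml"}


class FeedLinkParser(HTMLParser):
    """Collects href values from <link rel="alternate"> tags that advertise a feed."""

    def __init__(self):
        super().__init__()
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = {name: (value or "") for name, value in attrs}
        rel = attr.get("rel", "").lower().split()
        link_type = attr.get("type", "").strip().lower()
        if "alternate" in rel and link_type in FEED_TYPES:
            href = attr.get("href", "").strip()
            if href:
                self.feeds.append(href)


def find_feeds(html):
    """Return feed URLs advertised in the given HTML (hypothetical helper)."""
    parser = FeedLinkParser()
    parser.feed(html)
    return parser.feeds


def find_sitemaps(robots_txt):
    """Return URLs from 'Sitemap:' lines in robots.txt text (hypothetical helper)."""
    sitemaps = []
    for line in robots_txt.splitlines():
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "sitemap":
            sitemaps.append(value.strip())
    return sitemaps
```

Running both helpers against a site's home page and robots.txt gives a rough picture of what the CMS is publishing without the owner having asked for it.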
If you absolutely do not want a site to appear in search results, Mueller says to use the noindex tag. Don’t assume that search engines won’t find a site just because it hasn’t been promoted or linked to.
Mueller also offered a recommendation for site owners who want the opposite, namely to launch a new site with maximum impact:
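To confirm that a page actually carries the directive, a small check like the following can help. It is a rough Python sketch, not an official tool: the function name `has_noindex` is hypothetical, and it only looks at the generic `<meta name="robots">` tag and a plain `X-Robots-Tag` response header, ignoring crawler-specific variants.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").strip().lower() == "robots":
            # The content attribute holds a comma-separated list of directives.
            for part in attr.get("content", "").split(","):
                self.directives.add(part.strip().lower())


def has_noindex(html, headers=None):
    """Return True if the HTML or the response headers carry a noindex directive.

    `headers` is an optional dict of response headers (hypothetical input shape).
    """
    parser = RobotsMetaParser()
    parser.feed(html)
    if "noindex" in parser.directives:
        return True
    for name, value in (headers or {}).items():
        if name.lower() == "x-robots-tag":
            if "noindex" in [p.strip().lower() for p in value.split(",")]:
                return True
    return False
```

For example, `has_noindex('<meta name="robots" content="noindex, nofollow">')` returns `True`, while a page with no robots meta tag and no matching header returns `False`.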
“If you want to launch something new with a bang (assuming that’s what you’re trying to do with a new & unknown domain), one idea could be to use the site removal tool to hide the site in search, and then to cancel that request when you’re making it live — that lets Google crawl & index the content ahead of time, but prevents it from being shown in search.”
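This approach is faster than switching pages from noindex to indexable at launch, because Google has already crawled and indexed the content by the time the removal request is cancelled. However, the removal tool only affects Google, so there’s no guarantee that other search engines won’t find and show the site before launch.
The most reliable way to keep a site out of search results is a noindex tag. Note that noindex controls indexing rather than discovery: crawlers still have to fetch a page in order to see the tag.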