Google officially announced that GoogleBot will no longer obey a Robots.txt directive related to indexing. Publishers relying on the robots.txt noindex directive have until September 1, 2019 to remove it and begin using an alternative.
Robots.txt Noindex Is Unofficial
The noindex robots.txt directive will no longer be supported because it was never an official directive.
Google has unofficially supported it in the past, but that will no longer be the case. Take due notice thereof and govern yourself accordingly.
Google Mostly Used to Obey Noindex Directive
StoneTemple published an article noting that Google mostly obeyed the robots.txt noindex directive.
Their conclusion at the time was:
“Ultimately, the NoIndex directive in Robots.txt is pretty effective. It worked in 11 out of 12 cases we tested. It might work for your site, and because of how it’s implemented it gives you a path to prevent crawling of a page AND also have it removed from the index.
That’s pretty useful in concept. However, our tests didn’t show 100 percent success, so it does not always work.”
That’s no longer the case. The noindex robots.txt directive is no longer supported.
This is Google’s official tweet:
“Today we’re saying goodbye to undocumented and unsupported rules in robots.txt
If you were relying on these rules, learn about your options in our blog post.”
This is the relevant part of the announcement:
“In the interest of maintaining a healthy ecosystem and preparing for potential future open source releases, we’re retiring all code that handles unsupported and unpublished rules (such as noindex) on September 1, 2019. “
How to Control Indexing?
Google’s official blog post listed five ways to control indexing:
- Noindex in robots meta tags
- 404 and 410 HTTP status codes
- Password protection
- Disallow in robots.txt
- Search Console Remove URL tool
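Two of the options above can be handled at the server level: returning a 410 (Gone) status for removed pages, and sending noindex as an HTTP header (the X-Robots-Tag equivalent of the robots meta tag). The sketch below is a minimal, hypothetical illustration of that routing logic; the paths and function name are invented for the example, not part of Google's announcement.

```python
# Hypothetical paths used only for illustration.
REMOVED_PATHS = {"/old-page"}      # pages permanently removed from the site
NOINDEX_PATHS = {"/internal-report"}  # pages served, but kept out of the index

def robots_response(path):
    """Return (status_code, extra_headers) for a request path."""
    if path in REMOVED_PATHS:
        # A 410 (or 404) status tells Google to drop the URL from the
        # index once the page is recrawled.
        return 410, {}
    headers = {}
    if path in NOINDEX_PATHS:
        # X-Robots-Tag is the HTTP-header form of the robots meta tag,
        # a supported replacement for the retired robots.txt noindex rule.
        headers["X-Robots-Tag"] = "noindex"
    return 200, headers
```

For pages where you control the markup rather than the server config, the equivalent is a `<meta name="robots" content="noindex">` tag in the page's head.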
Read the official Google announcement here.
Read the official Google tweet here.