Getting your webpages indexed by Google (and other search engines) is essential. Pages that aren’t indexed can’t rank.
How do you see how many pages you have indexed? You can:
- Use the site: operator.
- Check the status of your XML Sitemap Submissions in Google Search Console.
- Check your overall indexation status.
Each will give different numbers, but why they are different is another story.
For now, let’s just talk about analyzing a decrease in the number of indexed pages reported by Google.
If your pages aren’t being indexed, this could be a sign that Google may not like your page or may not be able to easily crawl it. Therefore, if your indexed page count begins to decrease, this could be because either:
- You’ve been slapped with a Google penalty.
- Google thinks your pages are irrelevant.
- Google can’t crawl your pages.
Here are a few tips on how to diagnose and fix the issue of decreasing numbers of indexed pages.
1. Are the Pages Loading Properly?
Make sure they have the proper 200 HTTP Header Status.
Did the server experience frequent or long downtime? Did the domain recently expire and get renewed late?
You can use a free HTTP Header Status checking tool to determine whether the proper status is there. For massive sites, typical crawling tools like Xenu, DeepCrawl, Screaming Frog, or Botify can test these.
The correct header status is 200. If a URL returns a 3xx status (other than a 301), a 4xx, or a 5xx instead, that is bad news for a URL you want indexed.
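As a rough illustration of the check described above, the following Python sketch requests only the headers of a URL and sorts the status code into the buckets used in this section. The function names (classify_status, fetch_status) are made up for this example, it relies only on the standard library, and it deliberately does not follow redirects so that a 3xx is reported as a 3xx. Treat it as a sketch for spot checks, not a replacement for a crawling tool.

```python
import http.client
from urllib.parse import urlsplit


def classify_status(code: int) -> str:
    """Sort an HTTP status code into the buckets discussed in this section."""
    if code == 200:
        return "ok"                  # the status you want for indexable URLs
    if code == 301:
        return "permanent-redirect"  # the one 3xx the article treats as acceptable
    if 300 <= code < 400:
        return "other-redirect"
    if 400 <= code < 500:
        return "client-error"
    if 500 <= code < 600:
        return "server-error"
    return "unexpected"


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Send a HEAD request and return the raw status without following redirects.

    Assumption: the server answers HEAD the same way it answers GET,
    which is usually but not always true.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        return conn.getresponse().status
    finally:
        conn.close()
```

Calling classify_status(fetch_status(url)) for each URL you expect to be indexed gives a quick list of the ones that need attention.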
2. Did Your URLs Change Recently?
Sometimes a change in CMS, backend programming, or server setting that results in a change in domain, subdomain, or folder may consequently change the URLs of a site.
Search engines may remember the old URLs but, if those URLs don’t redirect properly, a lot of pages can become de-indexed.
Ideally, a copy of the old site is still accessible in some form, so you can take note of all the old URLs and map out 301 redirects to the corresponding new URLs.
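One way to sketch the redirect mapping step is shown below. It assumes the simplest case, where a whole folder was renamed and every path under it stayed the same; the helper names and the example.com URLs are hypothetical. Real migrations often need a hand-built mapping for pages that were merged or removed.

```python
def build_redirect_map(old_urls, old_prefix, new_prefix):
    """Map every old URL under old_prefix to the same path under new_prefix."""
    mapping = {}
    for url in old_urls:
        if url.startswith(old_prefix):
            mapping[url] = new_prefix + url[len(old_prefix):]
    return mapping


def unmapped_urls(old_urls, mapping):
    """Old URLs that still need a manually chosen redirect target."""
    return [url for url in old_urls if url not in mapping]


def redirect_is_correct(status, location, expected_target):
    """True only for a 301 whose Location header is exactly the mapped URL."""
    return status == 301 and location == expected_target


# Hypothetical example: /blog/ was renamed to /articles/.
old_urls = [
    "https://example.com/blog/post-1",
    "https://example.com/blog/post-2",
    "https://example.com/about",
]
redirects = build_redirect_map(
    old_urls, "https://example.com/blog/", "https://example.com/articles/"
)
needs_review = unmapped_urls(old_urls, redirects)
```

Here needs_review holds the URLs that the prefix rule did not cover, which is a useful to-do list before the redirects go live.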
3. Did You Fix Duplicate Content Issues?
Fixing duplicate content often involves implementing canonical tags, 301 redirects, noindex meta tags, or disallows in robots.txt, all of which can result in a decrease in indexed URLs.
This is one example where the decrease in indexed pages might be a good thing.
Since this is good for your site, the only thing you need to do is to double check that this is definitely the cause of the decrease of indexed pages and not anything else.
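To double check that a page dropped out of the index because of a deliberate fix, it can help to confirm that the page really carries the directive you intended. The sketch below reads a page’s HTML and reports whether it has a noindex robots meta tag and which canonical URL it declares. The class and function names are invented for this example, and it only looks at the HTML; it does not check the X-Robots-Tag HTTP header or robots.txt.

```python
from html.parser import HTMLParser


class IndexingSignals(HTMLParser):
    """Collect the noindex and canonical signals found in a page's HTML."""

    def __init__(self):
        super().__init__()
        self.noindex = False
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        # Attribute values can be None (e.g. <meta name>), so normalise them.
        attributes = {name: (value or "") for name, value in attrs}
        if tag == "meta":
            if attributes.get("name", "").lower() in ("robots", "googlebot"):
                if "noindex" in attributes.get("content", "").lower():
                    self.noindex = True
        elif tag == "link":
            if "canonical" in attributes.get("rel", "").lower().split():
                self.canonical = attributes.get("href") or None


def indexing_signals(html):
    """Return the noindex flag and canonical URL declared in the HTML."""
    parser = IndexingSignals()
    parser.feed(html)
    return {"noindex": parser.noindex, "canonical": parser.canonical}
```

If a de-indexed page shows neither signal, and it is not redirected or disallowed, the duplicate content fix is probably not the reason it disappeared.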
4. Are Your Pages Timing Out?
Some servers have bandwidth restrictions because of the cost that comes with higher bandwidth; these servers may need to be upgraded. Sometimes the issue is hardware related and can be resolved by upgrading the server’s processing power or memory.
Some sites block IP addresses when visitors access too many pages at a certain rate. This setting is a strict way to guard against DDoS attempts, but it can also have a negative impact on your site.
Typically, this is monitored with a pages-per-second setting. If the threshold is too low, normal search engine bot crawling may hit it, and the bots will not be able to crawl the site properly.
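To see how a low threshold can catch an ordinary crawler, here is a small simulation of a sliding-window rate limit. It is only a model of the idea; actual anti-DDoS products use their own rules, and the function name and numbers are assumptions chosen for illustration. Timestamps and the window are in milliseconds so the arithmetic stays exact.

```python
from collections import deque


def first_blocked_request(timestamps_ms, max_requests, window_ms):
    """Return the index of the first request that exceeds the limit, or None.

    A request is blocked when more than max_requests requests fall inside
    the window_ms milliseconds ending at that request.
    """
    window = deque()
    for index, now in enumerate(timestamps_ms):
        # Forget requests that have aged out of the window.
        while window and now - window[0] >= window_ms:
            window.popleft()
        window.append(now)
        if len(window) > max_requests:
            return index
    return None


# A bot fetching one page every 200 ms (5 pages per second) for 4 seconds.
crawl = list(range(0, 4000, 200))

strict = first_blocked_request(crawl, max_requests=3, window_ms=1000)
relaxed = first_blocked_request(crawl, max_requests=10, window_ms=1000)
```

With the strict limit of 3 pages per second the bot is blocked on its fourth request, while the relaxed limit of 10 never triggers for the same crawl.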
If this is a server bandwidth limitation, then it might be an appropriate time to upgrade services.
If it is a server processing or memory issue, aside from upgrading the hardware, double check whether you have any kind of server caching technology in place. Caching puts less stress on the server.
If anti-DDoS software is in place, either relax the settings or whitelist Googlebot so that it is never blocked. Beware, though: there are fake Googlebots out there, so be sure to detect Googlebot properly. Detecting Bingbot follows a similar procedure.
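A common way to detect Googlebot properly is a reverse DNS lookup on the visiting IP, a check that the hostname belongs to googlebot.com or google.com, and then a forward DNS lookup to confirm the hostname resolves back to the same IP. The sketch below follows that pattern. The function names are invented, the resolver arguments exist only so the logic can be exercised without network access, and the domain list is an assumption that should be checked against Google’s current documentation.

```python
import socket

# Assumption: hostnames of genuine Googlebot crawlers end in one of these.
GOOGLE_CRAWLER_DOMAINS = (".googlebot.com", ".google.com")


def is_google_hostname(hostname):
    """True if the hostname sits under one of the expected Google domains."""
    return hostname.lower().rstrip(".").endswith(GOOGLE_CRAWLER_DOMAINS)


def is_verified_googlebot(ip, reverse_lookup=None, forward_lookup=None):
    """Reverse-resolve the IP, check the domain, then forward-confirm it."""
    if reverse_lookup is None:
        reverse_lookup = lambda address: socket.gethostbyaddr(address)[0]
    if forward_lookup is None:
        # gethostbyname_ex only returns IPv4 addresses.
        forward_lookup = lambda host: socket.gethostbyname_ex(host)[2]
    try:
        hostname = reverse_lookup(ip)
        if not is_google_hostname(hostname):
            return False
        return ip in forward_lookup(hostname)
    except OSError:
        # Covers failed DNS lookups (socket.herror, socket.gaierror).
        return False
```

The forward lookup matters because anyone who controls the reverse DNS for their own IP range can make it claim to be a googlebot.com host; only the real domain will resolve back to that IP.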
5. Do Search Engine Bots See Your Site Differently?
Sometimes what search engine spiders see is different than what we see.
Some developers build sites in a preferred way without knowing the SEO implications.
Occasionally, a preferred out-of-the-box CMS will be used without checking if it is search engine friendly.
Sometimes, it might have been done on purpose by an SEO who attempted to do content cloaking, trying to game the search engines.
Other times, the website has been compromised by hackers, who cause a different page to be shown to Google to promote their hidden links or cloak the 301 redirections to their own site.
The worst situation would be pages infected with some type of malware, which Google automatically deindexes as soon as it is detected.
Using Google Search Console’s fetch and render feature is the best way to see if Googlebot is seeing the same content as you are.
You can also run the page through Google Translate, even if you have no need for a translation, or check Google’s cached copy of the page. Be aware, though, that there are ways to keep content cloaked from both of these checks.
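As a rough first pass before using Google’s own tools, you can request the same URL once with a Googlebot user agent and once with a browser user agent, then compare the two responses. The sketch below does that with the standard library. The function names, the browser user agent string, and the 0.9 similarity threshold are all assumptions for illustration. It will not catch cloaking that keys on Googlebot’s IP address rather than its user agent, and pages with a lot of dynamic content may differ for harmless reasons.

```python
import difflib
import urllib.request

GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
# Placeholder browser user agent; any current browser string would do.
BROWSER_UA = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0 Safari/537.36"
)


def fetch_as(url, user_agent, timeout=10.0):
    """Download a page while presenting the given User-Agent header."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def similarity(html_a, html_b):
    """Ratio from 0.0 to 1.0 of how alike two pages are, word by word."""
    return difflib.SequenceMatcher(None, html_a.split(), html_b.split()).ratio()


def looks_cloaked(html_for_bot, html_for_browser, threshold=0.9):
    """Flag a page when the two versions differ more than the threshold allows."""
    return similarity(html_for_bot, html_for_browser) < threshold
```

A typical use would be looks_cloaked(fetch_as(url, GOOGLEBOT_UA), fetch_as(url, BROWSER_UA)), followed by a manual look at any page that gets flagged.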
Indexed Pages Are Not Used as Typical KPIs
Key Performance Indicators (KPIs), which help measure the success of an SEO campaign, often revolve around organic search traffic and ranking. KPIs tend to focus on the goals of a business, which are tied to revenue.
An increase in indexed pages may increase the number of keywords you can rank for, which can result in higher profits. However, the point of looking at indexed pages is mainly to see whether search engines are able to crawl and index your pages properly.
Remember, your pages can’t rank when search engines can’t see, crawl, or index them.
A Decrease in Indexed Pages Isn’t Always Bad
Most of the time, a decrease in indexed pages is a bad sign, but fixing duplicate, thin, or low-quality content can also reduce the number of indexed pages, which is a good thing.
Learn how to evaluate your site by looking at these five possible reasons why your indexed pages are going down.