How to Get Google to Index Your Site Deeper

Quite often, especially with huge, dynamically driven or user-generated websites, you may suspect you have indexing problems, i.e. Google doesn’t seem to dig as deep into your site as you want it to. The remedies vary, but here are some essential basics you might want to look into:

  • Make sure you really have indexing problems. In many cases, Google’s site: operator won’t show you the real number of URLs the search engine has indexed. You should thoroughly explore how deeply your site has been indexed in reality before arriving at any conclusions. Here are some tips for doing that (also mentioned at SEOmoz):
    • check how many pages are indexed in each directory (or subdirectory: the deeper you dig, the more accurate the results are), e.g. site:example.com/directory/;
    • search for subdirectory-specific keywords: inurl:subdirectory (or intitle:subdirectory);
    • check recently indexed pages (take advantage of “date range” option via advanced search).

    Whichever of these methods you choose, collect the data carefully and you will end up with a more accurate count of your site’s indexed pages.

  • Try to identify any non-indexing patterns. Which types of pages or subdirectories are being left out? Try to work out the rules or logic behind what isn’t getting indexed. This will help you determine the likely causes: duplicate content, perhaps, or poor internal architecture. Try to draw your site’s main interlinking structure and find pages that get inadequate link juice.
  • Work on your site’s external deep-linking ratio: deep link to your subdirectories from external resources (quality deep-linking directories should still help with that).
  • Mind your site’s crawl rate: as I said in my post on improving crawl rate, Googlebot “works on a budget”: if you keep it busy crawling huge files, waiting for your pages to load, or following duplicate-content URLs, you might be missing the chance to show it your other pages.
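The interlinking analysis above can be sketched programmatically: given your site’s internal link graph, a breadth-first search from the homepage shows each page’s click depth, and the deepest pages are the likeliest candidates for inadequate link juice. A minimal sketch (the URLs and graph below are hypothetical):

```python
from collections import deque

def click_depths(link_graph, start="/"):
    """BFS over an internal link graph; returns {page: clicks from start}.
    Pages missing from the result are unreachable via internal links."""
    depths = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in link_graph.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths

# Hypothetical site structure: homepage links to two sections, and so on.
graph = {
    "/": ["/blog/", "/products/"],
    "/blog/": ["/blog/post-1", "/blog/post-2"],
    "/products/": ["/products/widget"],
    "/products/widget": ["/products/widget/reviews"],
}

depths = click_depths(graph)
# Pages three or more clicks from the homepage get the least link juice:
deep_pages = [p for p, d in depths.items() if d >= 3]
print(deep_pages)  # → ['/products/widget/reviews']
```

Pages that never show up in the result at all are orphans — unreachable by internal links and invisible to a crawler that starts at your homepage.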
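To see where that crawl budget actually goes, you can tally Googlebot hits per URL from your server access logs. A rough sketch, assuming combined-format Apache/Nginx log lines (the sample lines are made up, and a production check should also verify Googlebot via reverse DNS rather than trusting the user agent):

```python
import re
from collections import Counter

# Combined log format: IP - - [date] "METHOD path HTTP/x" status size "referrer" "UA"
LOG_RE = re.compile(r'"[A-Z]+ (?P<path>\S+) HTTP/[^"]*" \d+ \S+ "[^"]*" "(?P<ua>[^"]*)"')

def googlebot_hits(log_lines):
    """Count requests per path made by a Googlebot user agent."""
    hits = Counter()
    for line in log_lines:
        m = LOG_RE.search(line)
        if m and "Googlebot" in m.group("ua"):
            hits[m.group("path")] += 1
    return hits

# Made-up sample lines for illustration:
sample = [
    '66.249.66.1 - - [10/Oct/2008:13:55:36 +0000] "GET /blog/ HTTP/1.1" 200 2326 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '66.249.66.1 - - [10/Oct/2008:13:55:40 +0000] "GET /blog/ HTTP/1.1" 200 2326 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '203.0.113.9 - - [10/Oct/2008:13:56:01 +0000] "GET /products/ HTTP/1.1" 200 512 "-" "Mozilla/5.0 (Windows NT)"',
]
print(googlebot_hits(sample))  # the paths Googlebot is spending its budget on
```

If the top of that tally is dominated by duplicate-content URLs or session-ID variations, that budget isn’t reaching the pages you actually want indexed.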
Ann Smarty
Ann Smarty is the blogger and community manager at Internet Marketing Ninjas. Ann's expertise in blogging and tools serves as a base for her writing, tutorials and her guest blogging project.


33 thoughts on “How to Get Google to Index Your Site Deeper”

  1. Absolutely agree with monitoring any no-index anomalies. It’s always funny to me when a company says that none of their “blog articles” are indexed and they wonder why. So they ask before they even look at their robots file… noindex!?

  2. It’s very important to check that the count of indexed pages doesn’t drop after any architectural change, so I have a crawler that I run against my site that outputs a text file of pages it finds. It obeys all the nofollow and robots.txt rules and even processes some javascript so that Google Analytics gets pinged for each page.

    Giving Google hints via Analytics is an excellent way of getting new pages into the index quickly.
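The robots-file check mentioned in these comments is easy to automate with Python’s standard library before (or instead of) writing a full crawler. A quick sketch — the rules and URLs here are invented for illustration:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents that would silently block a whole blog:
robots_txt = """\
User-agent: *
Disallow: /blog/
Disallow: /tmp/
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

for url in ["http://example.com/blog/post-1", "http://example.com/about"]:
    allowed = parser.can_fetch("Googlebot", url)
    print(url, "->", "crawlable" if allowed else "BLOCKED by robots.txt")
```

In practice you would point `RobotFileParser` at your live robots.txt with `set_url()` and `read()`, then run your list of should-be-indexed URLs through `can_fetch()` to catch accidental blocks.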

  3. Hi Ann,

    Like you said, Googlebot works on a budget; do you have any more suggestions on what we can do to make such a short visit effective for our blog’s indexing?

  4. I had never thought about the external deep linking strategy before.

    The other stuff is pretty basic, and frustrating. Checking every day to see if Google indexed your site is annoying.

    Thanks for the tips!

  5. ‘Work on your site external deep linking ratio’

    Hey Ann, any tips for this? This has always been difficult for me outside of the social bookmarking sites. It’s always easy to get homepage links, but even at SES San Jose, I couldn’t get more than the usual (digg, sphinn, buzz, answer…

  6. @Nikki, that’s pretty aggressive. What size are the stores? Number of products? And what framework are they built on? Interested (e-commerce optimization is all I used to do). Have you seen Magento yet? SEO slickness.

  7. You could just use an XML sitemap and submit it to each of the engines’ Webmaster Tools. This will ensure all your pages are indexed.

    Deep linking is such a con term. There is either a link to the page, or there is not. It’s not deep or shallow linking; it is nothing but a hyperlink. Some links use anchor text and others do not.
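Whatever a sitemap does or doesn’t guarantee (see the discussion below), generating one is straightforward. A minimal sketch with the standard library, following the sitemaps.org protocol — the URLs are placeholders:

```python
import xml.etree.ElementTree as ET

def build_sitemap(urls):
    """Build a minimal sitemaps.org-style <urlset> document as a string."""
    urlset = ET.Element("urlset", xmlns="http://www.sitemaps.org/schemas/sitemap/0.9")
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")

sitemap = build_sitemap([
    "http://example.com/",
    "http://example.com/blog/post-1",
])
print(sitemap)
```

The protocol also allows optional `<lastmod>`, `<changefreq>`, and `<priority>` children per `<url>` entry; they are hints, not commands.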

  8. @Conrad, the answer would largely depend on your site and niche. Any “universal” method (like deep-linking directories) is no longer as effective as it used to be…

  9. @Clint: yep, just what I said:

    “Using the Sitemap protocol does not guarantee that web pages are included in search engines”

    – so no indexing help.

    Besides, Google’s official policy is the same:

    “Using this protocol does not guarantee that your webpages will be included in search indexes”.

  10. @ Ann

    Yes, of course the search engines, Google especially, are going to have a disclaimer that a sitemap submission in and of itself will not ensure indexing of pages.

    This is due to the amount of spam that still chokes the search engines.

    However, for over 25 of my own sites the submission has worked as it should, with all pages indexed as one would like.

  11. @Clint,

    “the submission has worked as it should”

    What do you mean by that, if the page you yourself cited above clearly stated that it should only help with discovery…

  12. @Buffalo…..thank you…some still think Google Organic owes them something too lol…


    Once discovered, the pages would then be indexed. (Else why were they discovered?) Without discovery there would be no indexing…

    can’t hit what you can’t see so to speak.

  13. @Clint, I am sorry to say, but discovery and inclusion in the index are two different processes, and the first doesn’t necessarily lead to the second.

  14. No, Clint: “the first one [discovery] doesn’t necessarily mean the second one [indexing]” means that a discovered page won’t necessarily be included in the index. Many factors might come into play: a search engine might view it as a duplicate or simply not find enough crawlable content to keep it…

  15. Hi Ann

    Actually, to determine duplicate content the page would need to be indexed and included in the word barrels so that the search engine can scan its database to determine possible content duplication.

    On your latter point, I would agree that errors and not enough content for indexing could lead to non-indexation.

    We can then safely say that discovery is needed for indexing, or non-indexing, to be determined. There cannot be indexing or non-indexing without discovery.

    Discovery can occur via three separate methods: the spider crawls the server and finds the page(s), the spider follows an external link to the page(s), or the spider follows a submitted sitemap URL to the page(s).

    Hope this clears things up.