How to Get Google to Index Your Site Deeper

Very often, especially with huge dynamically driven or user-generated websites, you might feel you have indexing problems, i.e. Google doesn’t seem to dig your site as deep as you want it to. The remedies may vary, but here are some essential basics you might want to look into:

  • Make sure you really have indexing problems. In many cases, Google’s site: operator won’t show you the real number of URLs the search engine has indexed. You should thoroughly explore how deep your site has been indexed in reality before arriving at any conclusions. Here are some tips for doing that (also mentioned at SEOmoz):
    • check how many pages are indexed in each directory (or subdirectory: the deeper you dig, the more accurate the results are): site:yoursite.com/subdirectory/sub-subdirectory1 + site:yoursite.com/subdirectory/sub-subdirectory2, etc.
    • search for subdirectory-specific keywords: site:yoursite.com inurl:subdirectory (or site:yoursite.com intitle:subdirectory);
    • check recently indexed pages (take advantage of the “date range” option in advanced search).

    Whichever of these methods you choose, carefully collect the data and you will end up with a more accurate count of your site’s indexed pages (a small aggregation sketch after this list can help with the counting).

  • Try to identify any non-indexing patterns. Which types of pages or subdirectories are left out? Try to work out any non-indexing rules or logic. By doing this you will be able to determine the issues that most likely caused the indexing problems: probably duplicate content or incorrect internal architecture. Try to map your site’s main interlinking structure and find pages that get inadequate link juice (see the link-graph sketch after this list).
  • Work on your site’s external deep-linking ratio: deep link to your subdirectories from external resources (quality deep-linking directories should still help with that).
  • Mind your site’s crawl rate: as I said in my post on improving crawl rate, Googlebot “works on a budget”. If you keep it busy crawling huge files, waiting for your pages to load, or following duplicate-content URLs, you might be missing the chance to show it your other pages (a quick timing sketch follows below).
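
To make the per-directory comparison less tedious, here is a minimal Python sketch, assuming you can export a flat list of your site’s URLs (from your CMS, server logs, or an XML sitemap; the list below is a made-up placeholder). It counts the URLs you actually have in each top-level directory and prints the matching site: query to run, so you can compare your real numbers against what Google reports:

```python
from collections import Counter
from urllib.parse import urlparse

# Hypothetical input: a flat list of your site's URLs, e.g. exported from
# your CMS or an XML sitemap. Replace with your own data.
urls = [
    "http://yoursite.com/blog/post-1",
    "http://yoursite.com/blog/post-2",
    "http://yoursite.com/products/widgets/blue",
    "http://yoursite.com/products/widgets/red",
    "http://yoursite.com/about",
]

def first_directory(url: str) -> str:
    """Return the top-level directory of a URL's path, or '/' for root-level pages."""
    parts = [p for p in urlparse(url).path.split("/") if p]
    return parts[0] if len(parts) > 1 else "/"

# Count how many URLs you actually have in each top-level directory.
counts = Counter(first_directory(u) for u in urls)
domain = urlparse(urls[0]).netloc

for directory, total in sorted(counts.items()):
    if directory == "/":
        print(f"{total:>5} known URLs at the root -- check with: site:{domain}")
    else:
        print(f"{total:>5} known URLs in /{directory}/ -- check with: site:{domain}/{directory}/")
```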
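
To find pages that get too little link juice, one option is to model your internal link structure as a directed graph and run PageRank over it. Below is a rough sketch using the networkx library; the edge list is a made-up example, and in practice you would build it from a crawl of your own site:

```python
import networkx as nx  # pip install networkx

# Hypothetical internal link graph: (source page, target page) pairs.
# In practice, build this edge list from a crawl of your own site.
internal_links = [
    ("/", "/blog/"),
    ("/", "/products/"),
    ("/blog/", "/blog/post-1"),
    ("/blog/", "/blog/post-2"),
    ("/products/", "/products/widgets/"),
    ("/blog/post-1", "/products/widgets/"),
    ("/products/widgets/", "/products/widgets/blue"),  # only one link in
]

graph = nx.DiGraph(internal_links)
scores = nx.pagerank(graph, alpha=0.85)

# Pages at the top of this list receive the least internal link equity and
# are the most likely candidates for indexing problems.
for page, score in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{score:.4f}  {page}")
```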
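
Finally, as a quick way to spot pages that might be eating up that crawl budget, a sketch along these lines can flag slow or heavy pages worth investigating (the URL list and the 1-second / 500 KB thresholds here are arbitrary placeholders, not recommendations):

```python
import requests  # pip install requests

# Hypothetical sample of URLs to spot-check; in practice, pull these from
# your server logs or sitemap.
pages = [
    "http://yoursite.com/",
    "http://yoursite.com/blog/post-1",
    "http://yoursite.com/products/widgets/blue",
]

for url in pages:
    response = requests.get(url, timeout=30)
    seconds = response.elapsed.total_seconds()  # time until the response arrived
    kilobytes = len(response.content) / 1024    # size of the downloaded body
    flag = "  <-- slow or heavy?" if seconds > 1.0 or kilobytes > 500 else ""
    print(f"{seconds:6.2f}s  {kilobytes:8.1f} KB  {url}{flag}")
```
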
Ann Smarty
Ann Smarty is the blogger and community manager at Internet Marketing Ninjas. Ann's expertise in blogging and tools serves as a base for her writing, tutorials and her guest blogging project, MyBlogGuest.com.
  • Absolutely agree with monitoring any non-indexing anomalies. It’s always funny to me when a company says that none of their “blog articles” are indexed and they wonder why. They ask before they even look at their robots file… noindex!?

  • Absolutely true…
    We should always check the pages that have been indexed, and also others that may bring us targeted traffic.

  • It’s very important to check that the count of indexed pages doesn’t drop after any architectural change, so I have a crawler that I run against my site that outputs a text file of the pages it finds. It obeys all the nofollow and robots.txt rules and even processes some JavaScript so that Google Analytics gets pinged for each page.

    Giving Google hints via Analytics is an excellent way of getting new pages into the index quickly.

  • Hi Ann,

    Like you said, Googlebot works on a budget. Do you have any more suggestions on how we can make such a short visit effective for our blog’s indexing?

  • @Louis, please refer to this article on improving crawl rate, which lists some recommendations on that.

  • Thanks for sharing these points! This information is so true, I plan on using this post as a helpful tip sheet for our SEO staff…thanks again!

  • thank you for sharing

  • I had never thought about the external deep linking strategy before.

    The other stuff is pretty basic, and frustrating. Checking every day to see if Google indexed your site is annoying.

    Thanks for the tips!

  • ‘Work on your site’s external deep-linking ratio’

    Hey Ann, any tips for this? This has always been difficult for me outside of the social bookmarking sites. It’s always easy to get homepage links, but even at SES San Jose, I couldn’t get more than the usual (digg, sphinn, buzz, ma.gnolia.com) answer…

  • Hey great information. I am currently working on 3 online stores so this info will be most helpful.

  • @Nikki, that’s pretty aggressive. What are the sizes of the stores? Number of products? And what framework are they built on? I’m interested (e-commerce optimization is all I used to do). Have you seen Magento yet? SEO slickness.

  • You could just use an XML sitemap and submit it to each engine’s webmaster tools. This will ensure all your pages are indexed.

    Deep linking is such a con term. There is either a link to the page or there is not. It’s not deep or shallow linking; it is nothing but a hyperlink. Some links use anchor text and others do not.

  • @Clint, no, officially an XML sitemap does not help with indexing – it only helps with discovery…

  • @Conrad, the answer would largely depend on your site and niche. Any “universal” method (like deep-linking directories) is no longer as effective as it used to be…

  • @ Ann

    Might want to read the home page of this site.

    http://www.sitemaps.org/

    ;->

    Clint

  • @Clint: yep, just what I said:

    “Using the Sitemap protocol does not guarantee that web pages are included in search engines”

    – so no indexing help.

    Besides, Google’s official policy is the same:

    “Using this protocol does not guarantee that your webpages will be included in search indexes”.

  • @ Ann

    Yes, of course the search engines, Google especially, are going to have a disclaimer that a sitemap submission in and of itself will not ensure indexing of pages.

    This is due to the amount of spam that still chokes the search engines.

    However, for over 25 of my own sites the submission has worked as it should, with all pages indexed as one would like.

  • @Clint, absolutely right. It’s funny how people really feel that disclaimers aren’t “fair”.

  • @Clint,

    “the submission has worked as it should”

    What do you mean by that, if the link that you yourself cited above clearly states that it should only help with discovery…

  • @Buffalo…..thank you…some still think Google Organic owes them something too lol…

    @Ann

    Once discovered, the pages would then be indexed. (Else why were they discovered?) Without discovery there would be no indexing…

    You can’t hit what you can’t see, so to speak.

  • @Clint, I am sorry to say, but discovery and inclusion in the index are two different processes, and the first one doesn’t necessarily lead to the second.

  • @ Ann

    I see…

    Then could you be so kind as to explain how a page could be indexed that had not been discovered by the crawler bot?

  • No, Clint: “the first one [discovery] doesn’t necessarily mean the second one [indexing]” means that a discovered page won’t necessarily be included in the index. Many factors might come into play: a search engine might view it as a duplicate or simply find too little crawlable content to keep it…

  • Hi Ann

    Actually, to determine duplicate content the page would need to be indexed and included in the word barrels so that the search engine can scan its database to determine possible content duplication.

    On your latter point, I would agree that errors and too little content for indexing could lead to non-indexation.

    We can then safely say that discovery is needed to determine indexing or non-indexing, but there cannot be indexing or non-indexing without discovery.

    Discovery can occur via three separate methods: the spider crawls the server and finds the page(s), the spider follows an external link to the page(s), or the spider follows a submitted sitemap URL to the page(s).

    Hope this clears things up.

  • Hello, Ann. Thank you for sharing. It is really important to get the site indexed well.

  • This is really a nice post. I really like this idea, so I subscribed…

  • Great article, deeper, deeper!!

  • Nice tips, thanks very much for the info 🙂

  • Thanks… great article, I’ll be sure to double check my site.

  • marckdon

    A good sitemap and good backlinks.

  • I also face the problem of delayed indexing, but when I submit my content to social bookmarking sites, the URLs get indexed quickly.