
Gasp! You Still Need to Care about PageRank

Indexation. Ensuring that all of your site’s important pages are indexed is at the root of a sound SEO strategy. Websites with thousands of pages often run the risk of not getting all of their pages indexed, and if this describes your site, you might not even know it. Why? If you’ve got a huge site, chances are that overall traffic levels are decent, which makes the problem difficult to spot.

Simply put, if your pages aren’t indexed, they will never rank in Google. But why would these pages not get indexed?

The number of pages indexed is roughly proportional to your PageRank

Wait, what? I thought PageRank was so 2003. Actually, not so. In a March 2010 interview with Eric Enge, Matt Cutts squashes the commonly held theory that every site has a dedicated “crawl budget” and instead confirms that PageRank plays a larger role in indexation than most people assume.

Typically, the pages buried deep within a site’s architecture are the ones most likely to suffer from indexation problems. These are often product pages and older articles that may even be hard to find on the site itself (think past articles on newspaper sites). This represents a significant lost opportunity to capture traffic from long-tail search queries.
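To make that concrete, here is a minimal sketch of the classic PageRank iteration run over a hypothetical five-page site whose product pages sit three clicks from the home page. The page names, link structure, and damping factor are illustrative assumptions, not data from any real site:

```python
# Minimal PageRank sketch over a hypothetical deep site architecture.
# All page names and links are made up for illustration.

def pagerank(links, damping=0.85, iterations=50):
    """Iteratively compute PageRank for a {page: [outlinks]} graph."""
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outlinks in links.items():
            if outlinks:
                # Each page passes a damped share of its rank to its outlinks.
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
            else:  # dangling page: spread its rank evenly
                for p in pages:
                    new_rank[p] += damping * rank[page] / n
        rank = new_rank
    return rank

# Products are three clicks from home and receive just one internal link each.
deep_site = {
    "home": ["category"],
    "category": ["subcategory"],
    "subcategory": ["product-a", "product-b"],
    "product-a": ["home"],
    "product-b": ["home"],
}

for page, score in sorted(pagerank(deep_site).items(), key=lambda kv: -kv[1]):
    print(f"{page:12s} {score:.3f}")
```

In this toy graph the home page settles at roughly twice the PageRank of either product page; on a real site with thousands of products hanging off deep category trees, the gap only widens, and it is exactly those starved pages that risk dropping out of the index.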

Google’s “Mayday” update seems to also confirm this.

This change seems to have primarily impacted very large sites with “item” pages that don’t have many individual links into them, might be several clicks from the home page, and may not have substantial unique and value-added content on them. For instance, ecommerce sites often have this structure. The individual product pages are unlikely to attract external links and the majority of the content may be imported from a manufacturer database.

Vanessa Fox on SearchEngineLand

Back in 2009 I wrote about the concept of crawl equity, and while those best practices still ring true, that piece didn’t factor in PageRank. Understanding that PageRank (or the lack thereof) can prevent indexation really just means that the affected pages need more links – both internal and external. The key takeaway here is to formulate a strategy for obtaining deep links into your website for better indexation. In addition, develop an architecture strategy that brings pages currently buried deep within the site to within a few clicks of the home page, as the sketch below illustrates. Furthermore, cleaning up duplicate content will help ensure that links are not spread across three different versions of the same piece of content but are instead aggregated to strengthen the value of a single page.
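As a rough illustration of the architecture point, the sketch below compares PageRank across the same five hypothetical pages linked two different ways: the deep chain from the earlier example, and a flattened version in which the home page links straight to the product pages. It leans on the third-party networkx library, and the exact numbers only matter directionally:

```python
# Hypothetical comparison: deep chain vs. flattened internal linking.
# Requires: pip install networkx
import networkx as nx

# Deep chain: products are only reachable through the subcategory page.
deep = nx.DiGraph([
    ("home", "category"), ("category", "subcategory"),
    ("subcategory", "product-a"), ("subcategory", "product-b"),
    ("product-a", "home"), ("product-b", "home"),
])

# Flattened: same five pages, but home now links directly to the products.
flat = nx.DiGraph([
    ("home", "category"), ("home", "product-a"), ("home", "product-b"),
    ("category", "subcategory"),
    ("subcategory", "product-a"), ("subcategory", "product-b"),
    ("product-a", "home"), ("product-b", "home"),
])

for name, graph in (("deep", deep), ("flat", flat)):
    ranks = nx.pagerank(graph, alpha=0.85)
    print(name, {page: round(score, 3) for page, score in ranks.items()})
```

Flattening the structure lifts each product page’s PageRank noticeably with no new external links at all; adding genuine deep links from other sites compounds the effect.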

Of course, best practices like creating XML sitemaps and utilizing your robots.txt file to disallow problematic pages will help the engines spend their crawl time on the pages that matter. With that said, however, to be most effective it might be time to revisit your PageRank.
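For completeness, here is a minimal, hypothetical sketch of those two pieces of housekeeping: a script that writes an XML sitemap (in the sitemaps.org format) listing the deep URLs you want crawled, plus a robots.txt that keeps crawlers out of a duplicate-content path and points them at the sitemap. The domain and paths are made up:

```python
# Generate a bare-bones XML sitemap and robots.txt.
# The domain, URLs, and disallowed path are hypothetical examples.
import xml.etree.ElementTree as ET

urls = [
    "https://www.example.com/",
    "https://www.example.com/category/subcategory/product-a",
    "https://www.example.com/category/subcategory/product-b",
]

# Sitemap per the sitemaps.org 0.9 protocol: a <urlset> of <url><loc> entries.
urlset = ET.Element("urlset", xmlns="http://www.sitemaps.org/schemas/sitemap/0.9")
for url in urls:
    ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = url
ET.ElementTree(urlset).write("sitemap.xml", encoding="utf-8", xml_declaration=True)

# robots.txt: block a duplicate-content path and advertise the sitemap.
with open("robots.txt", "w") as f:
    f.write(
        "User-agent: *\n"
        "Disallow: /print/\n"  # hypothetical printer-friendly duplicates
        "Sitemap: https://www.example.com/sitemap.xml\n"
    )
```

Submitting the sitemap through Google Webmaster Tools also gives you a count of submitted versus indexed URLs, which is a quick way to spot the indexation gap discussed above.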


Rachel Freeman

Rachel Freeman works for Jive Software, the pioneer and leading provider of social business solutions. She has expertise in all aspects of search engine marketing and specializes in SEO and paid search for the B2B sector. Freeman has been responsible for the development and execution of countless search and social marketing campaigns over her years in the search marketing industry.



15 thoughts on “Gasp! You Still Need to Care about PageRank”

  1. Never thought about this aspect, that PR can be the factor responsible even for indexing. Nice post and thanks a lot for sharing. We should focus more on deep-level link building.

  2. What Google sees for PageRank is almost definitely different than the toolbar PR we get access to. So what it boils down to is:

    Have a sound navigation structure/information architecture
    Build links, thus improve the PR trickling throughout the site

    These are classic and accepted SEO practices. Toolbar PR has a slight but real correlation with rankings, and it will improve roughly in proportion as you make these changes.

  3. I don't agree with this. The idea that a page without PR is not indexed is just analysis, not proven fact. Can you provide us with statistics that validate your points?

  4. If this is true, then what about ezinearticles? They surely have thousands of pages in their archive. Have they all been indexed? Still, it does make sense, since PR relates to your importance on the web; if you have a PR of 0, you are less important and less time is spent crawling your site.

  5. I still think looking at PR is a backwards way of doing things. PR is just an indicator of importance (and not a very reliable one). If you removed the PageRank, the site or page wouldn't change; the reasons behind the PR would still be there. It's not the PR that is important, it's the factors that go into calculating the PR.
    i.e. It's not PR that helps deeper-level pages get indexed, it's good internal linking. Good internal linking just so happens to also produce higher PR!

  6. Excellent post, and this is something that I've debated on forums for a few years now. It seems everyone is quick to denounce PageRank as a tiny ranking factor when they have no real understanding of its importance.

  7. The conversation gets confusing when people don't specify whether they mean toolbar PageRank (TBPR) or real/current PageRank. TBPR is not always a prerequisite for indexation; I have pages with no TBPR that are indexed. High-PR sites have more PR to flow through the site, and that is bound to help get more deep pages indexed on large sites.

  8. I'm not sure this is correct. Usually newer pages get indexed quickly, even though they have no PageRank, yet some old pages, which are likely to have non-zero PageRank, are often left out of the index.

    Now if PR really is the deciding factor in the number of pages indexed, then I wonder how the mechanism works. Is it a percentage kind of thing? Like Google will index 10% of a site's pages if the site is PR1, 20% if PR2, and so on… not necessarily in such a linear fashion. That could explain why a number of URLs on my site are not indexed even though Google Webmaster Tools acknowledges the existence of the pages (from the number of submitted URLs in the sitemap).

    I wish Google were more forthcoming about how it determines which pages to index.