
Google PageRank & Play-Doh

Cannibalized page rank – one of the least recognized mistakes in online marketing.

If I lost you at page rank, visit these sites for a proper definition.

In a nutshell, page rank (PR) is the method used by Google to rank web pages. This method “democratically” weighs links to your site’s pages from other pages on the amazing world wide web. If you have the Google Tool Bar…

[Image: Google Toolbar]

…you will notice the “Page Rank” area, which displays a green and a white sliding bar. This lets you know the rank of the page you are on at that moment. Google obviously has a PR of 10 and the Official Andrew Dice Clay site has a PR of 2. Clearly, Google is doing something right!

Unfortunately, the tool bar page rank doesn’t display the actual live rank of a page. In fact, it might be several months old because Google only updates it once every couple of months. Before you jump on the “Google is evil” bandwagon, take a deep breath; it’s okay. Page rank is not what a good marketer should be measuring on a daily or even monthly basis. There are literally dozens of other factors more worthy of your valuable time, such as domain age, page titles and, as I noted earlier, cannibalistic URLs.

So, what am I talking about?

When you build a web site, you create paths to certain pages. Most web developers will put those pages in specific folders like: www.example.com/press-releases/ or www.example.com/store/ to give the site a logical structure. Unfortunately, depending on a slew of technical stuff like servers, file extensions, redirects and internal site links, that pretty path can end up looking like a number of unique paths to both users and search engines even though they’re really the same:

www.example.com/press-releases/
example.com/press-releases/

www.example.com/press-releases/default.aspx
example.com/press-releases/default.aspx
www.example.com/press-releases/?id=1
example.com/press-releases/?id=1
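
To make the overlap concrete, here is a minimal Python sketch that collapses the six spellings above into one address. The rules it applies are my assumptions for this example site, not anything Google publishes: the www host is treated as the preferred one, default.aspx is treated as the folder's default document, and the id variable is assumed not to change the content. The names canonicalize, PREFERRED_HOST, DEFAULT_DOCUMENTS and IGNORED_PARAMS are made up for illustration.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumptions for this sketch only: adjust them to how your own site behaves.
PREFERRED_HOST = "www.example.com"
DEFAULT_DOCUMENTS = ("default.aspx", "index.htm", "index.html")
IGNORED_PARAMS = {"id"}


def canonicalize(url: str) -> str:
    """Collapse equivalent spellings of a URL into one preferred form."""
    if "://" not in url:
        url = "http://" + url  # the examples above omit the scheme
    parts = urlsplit(url)

    # Hostnames are case-insensitive; send the bare domain to the www host.
    host = parts.netloc.lower()
    if host == PREFERRED_HOST[len("www."):]:
        host = PREFERRED_HOST

    # Drop a trailing default document so /folder/default.aspx becomes /folder/.
    path = parts.path or "/"
    for doc in DEFAULT_DOCUMENTS:
        if path.lower().endswith("/" + doc):
            path = path[: -len(doc)]
            break

    # Keep only the query variables that are not on the ignore list.
    kept = [
        pair for pair in parts.query.split("&")
        if pair and pair.split("=", 1)[0] not in IGNORED_PARAMS
    ]
    return urlunsplit((parts.scheme, host, path, "&".join(kept), ""))


variants = [
    "www.example.com/press-releases/",
    "example.com/press-releases/",
    "www.example.com/press-releases/default.aspx",
    "example.com/press-releases/default.aspx",
    "www.example.com/press-releases/?id=1",
    "example.com/press-releases/?id=1",
]
print({canonicalize(v) for v in variants})
```

Under those assumptions, all six spellings collapse to a single URL, which is the way you want search engines to see them.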

Why is this a problem?

Isn’t this just ugly code to some standards-compliant web freaks?

It’s a problem because when links point to that page (from your site or another site) using different paths (through a mistake or technical error), the search engines see each path as a unique page. And because Google determines page rank from links to your page, if it finds multiple pages, you could be splitting your best possible ranking.

Let’s use Play-Doh to demonstrate the principle.

If you have three page paths:

example.com/default.aspx
example.com/default.aspx?id=1
WWW.Example.com/Default.aspx

The first one might have just one or two links to it and a PR of 1. The second might have a few more links and a PR of 2. And let’s pretend the third has even more links with a PR of 3.

[Image: page rank]

Unfortunately, page rank isn’t simple arithmetic, but for the sake of this discussion, if you could make all of those links go to the “same” page, you would be channeling greater link equity to one central location. This could potentially result in a PR of a big, beautiful 6, which should mean increased rankings:
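
For anyone who wants to see the "not simple arithmetic" part in action, below is a toy Python sketch of the textbook PageRank iteration run on two tiny made-up link graphs. It is only an illustration under stated assumptions: the six linking sites are hypothetical, 0.85 is the commonly cited damping value, and the scores it prints are raw probabilities, not the 0 to 10 numbers the tool bar shows. It is certainly not Google's production algorithm.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook power iteration. links maps a page to the pages it links to."""
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Rank sitting on pages with no outgoing links is shared by everyone.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1 - damping) / n + damping * dangling / n for p in pages}
        for page, targets in links.items():
            for target in targets:
                new_rank[target] += damping * rank[page] / len(targets)
        rank = new_rank
    return rank


# Six hypothetical outside sites link to the "same" home page, but the
# links are spread over three spellings of its URL (A, B and C).
split = {
    "site1": ["A"],
    "site2": ["B"], "site3": ["B"],
    "site4": ["C"], "site5": ["C"], "site6": ["C"],
}
# The same six links after every spelling is consolidated into one page.
merged = {site: ["home"] for site in split}

split_rank = pagerank(split)
merged_rank = pagerank(merged)
for page in ("A", "B", "C"):
    print(page, round(split_rank[page], 3))
print("home", round(merged_rank["home"], 3))
```

In this toy model the consolidated page scores higher than any one of the three duplicates, though not a neat sum of the three, which is the Play-Doh point in numbers.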

[Image: page rank 6]


How do you find out if you’re cannibalizing your page rank?

First things first… open your website in a browser and type in the domain as www.example.com. Then type in example.com without the “www.” If the domains stay the same when you type either one (that is, neither version redirects to the other), you need to designate one version over the other. For detailed instructions on how to do this, read Chris Hooley’s article, Canonicalize with .htaccess. Your goal is to make one version redirect to the other.
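
If you would rather not trust the address bar, the check can be scripted. The sketch below is a rough Python version using only the standard library: fetch_redirect asks a host for its home page without following redirects, and classify_redirect judges the answer. Both function names and the verdict labels are made up for this example, and treating 301 and 308 as the "good" permanent redirects is my reading of the HTTP status codes, not advice from Google.

```python
import http.client
from typing import Optional, Tuple
from urllib.parse import urlsplit


def fetch_redirect(host: str, path: str = "/") -> Tuple[int, Optional[str]]:
    """Return (status code, Location header) without following redirects."""
    conn = http.client.HTTPConnection(host, timeout=10)
    try:
        conn.request("HEAD", path)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


def classify_redirect(status: int, location: Optional[str], preferred_host: str) -> str:
    """Judge the response the non-preferred hostname gave us."""
    target_host = (urlsplit(location).hostname or "") if location else ""
    if not 300 <= status < 400:
        return "none"          # both hostnames serve the page: pick one
    if target_host != preferred_host:
        return "wrong-target"  # redirects, but not to the preferred host
    if status in (301, 308):
        return "permanent"     # what you want
    return "temporary"         # 302, 303, 307: change it to a 301


# Example (needs network access, so it is left commented out):
# status, location = fetch_redirect("example.com")
# print(classify_redirect(status, location, "www.example.com"))
```

Anything other than "permanent" means the two hostnames can still compete with each other.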

Now that that biggie is out of the way, you need to choose how you want your pages to look. I recommend removing extensions entirely if you are using folders (e.g. www.example.com/example/ versus www.example.com/example/index.htm). This is for maintenance reasons, but I am also partial to keeping pages in the root directory as much as possible, which means you have to show an extension like www.example.com/example.htm.

Whatever you choose to do, make sure you stay consistent in how you code your internal links. What I mean is that if you create a path on your site to www.example.com/example/, do not link to the page as www.example.com/example in another area of your site. This is one of the few times in life when it’s okay to play favorites!

You should also control the number of variables appended to your path. Often, for tracking or programming reasons, variables are appended to URLs that can make your paths appear different. Try to limit them and, again, be consistent.

To test for variables and path mistakes, create a Google Webmaster Central account and navigate to Webmaster Tools. Then go to the Links tab (after you have verified the site) and scroll through both the internal and external links. You should be able to easily spot serious issues. I also like to use Xenu's Link Sleuth, which detects broken links and also displays a list of paths on your site.
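
If the list of links is too long to eyeball, a few lines of Python can do the squinting for you. This is a minimal sketch that assumes you have already copied the URLs out of one of those tools into a list. The helper names page_key and find_variants are hypothetical, and the matching rules (ignore www, ignore a trailing slash, ignore appended variables) are assumptions you should loosen or tighten for your own site.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def page_key(url: str) -> str:
    """Rough identity of a page: host without www, path without a
    trailing slash, appended variables ignored."""
    parts = urlsplit(url if "://" in url else "http://" + url)
    host = parts.hostname or ""
    if host.startswith("www."):
        host = host[len("www."):]
    return host + parts.path.rstrip("/")


def find_variants(urls):
    """Return only the pages that are linked under more than one spelling."""
    groups = defaultdict(set)
    for url in urls:
        groups[page_key(url)].add(url)
    return {
        key: sorted(spellings)
        for key, spellings in groups.items()
        if len(spellings) > 1
    }


# Hypothetical links copied out of a link report or crawler.
urls = [
    "www.example.com/example/",
    "www.example.com/example",
    "www.example.com/store/?ref=newsletter",
    "example.com/store/",
    "www.example.com/about/",
]
for key, spellings in find_variants(urls).items():
    print(key, spellings)
```

Every page it reports is one where your links are pulling in more than one direction.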

And, that’s about it, though there’s probably a lot I did not cover either from my own misunderstanding or tiredness. Either way, I got to play with Play-Doh and talk about page rank. It doesn’t get much better than that!

Rhea Drysdale is Co-founder and CEO of Outspoken Media, which specializes in SEO consulting, link building, reputation management and social media. With more than seven years of experience, Rhea has spoken at SMX, SES, Web 2.0 Expo, Pubcon, Blog World Expo and BlueGlass. She has also been featured on CNN.com, in the Wall Street Journal and in SEO: The Search Engine Optimization Bible as an industry insider.


60 thoughts on “Google PageRank & Play-Doh”

  1. Michael,

    Actually, it is, but PageRank is only a portion of how it works. I wrote an article about PageRank just today which has a direct quote from Adam Lasnik. You may be interested in reading up on it to see how Google Search actually works.

  2. J – Thanks for linking to your article and nice post.

    M – This is a really basic post for people that are less than beginner SEOs. I could have gone really technical, but I was playing with Play-Doh and it’s difficult to get THAT serious. Thanks for being passionate. :)

  3. M – With a little more thought, I think meaning was lost in translation. I wasn’t saying PR was used to rank pages on the SERPs, simply as a way to rank A page. Let me know if I need to clarify that in the post or you were talking about something else.

  4. Well, I agree PR is what Google uses to rank websites, but it is not the only thing they use. I would say PageRank is responsible for 15-30% of the ranking algorithm.

  5. There is definitely something wrong with page rank. We launched a new site (www.catholicchurchsupply.com) three days ago and Google says it has a page rank of nine.

    I can only dream.

  6. Hi Rhea,

    Funny that this comes up again everywhere.
    Those things came up during the pretty big discussion at this SEJ post in March about a related subject, which then took a turn and ended up where you were writing about.

    As a result of it, I published a bunch of classic ASP code and documentation afterwards, to help with exactly those issues on MS IIS, where it is a bit more tricky, especially if you can’t install 3rd party ISAPI filters on the web server.

    The rewrite stuff that is coming with .NET automatically is also not perfect. The provided classic ASP code could easily be ported to .NET as well.

    See you in a bit at SES. Cheers!

  7. PR as Google uses it is updated frequently. PR as we can normally see it isn’t updated very often – potentially up to months in between updates.

  8. Rhea, I liked this illustration. You could do a similar post with a bunch of Play-Doh and show how you have a certain amount of Play-Doh (your PageRank), and you choose with your internal linking how to spread that Play-Doh throughout your site. If a given page has enough PageRank (reasonable-sized ball of Play-Doh), it can be in our main web index. If it has not-very-much PageRank (tiny ball of Play-Doh), it might be a supplemental result. And if only a minuscule iota of Play-Doh makes it to a page, then we might not get a chance to crawl that page.

  9. Jonathan wrote:
    “Actually, it is, but PageRank is only a portion of how it works. I wrote an article about PageRank just today which has a direct quote from Adam Lasnik. You may be interested in reading up on it to see how Google Search actually works.”

    Jonathan; I don’t think you know “who” you directed your comment to. Michael Martinez is one of the “smartest” and most respected in this industry. I feel you need to do some reading. Rhea did write a good but “basic” article. I do feel she could have been much more specific in her wording as she says that new people are reading, but those new people should not be given blurry information. The pagerank stuff is blurry for new people.

    And Michael is dead right; pagerank is NOT something Google uses to “rank” pages at all. Trust me on that.

  10. “And Michael is dead right; pagerank is NOT something Google uses to “rank” pages at all. Trust me on that.”

    It would be great to hear Matt or Adam verify that here.

  11. In a nutshell, Google uses importance and relevance to rank pages, though as Adam said, there are many different ways Google can sort results (for example, displaying a varied sample on the front page e.g. movie review, movie rental, movie details – that ranking method has nothing to do with either relevance or importance).

    The “abundance problem” (i.e. for popular queries, there are simply too many relevant pages) requires that importance, however it’s measured, is part of Google’s ranking algorithms.

    According to a 2004 patent called Adaptive computation of ranking, ranking documents using on-page content and anchor text is computationally more expensive than analyzing link structure and that relevant-only ranking method “fails to assign the highest ranks to the most important documents.”

    And though the TrustRank paper isn’t a Google concoction, it does state that pages with higher PageRank tend to be displayed higher in search results, if all pages are equally relevant.

    Why is ranking URLs based on high authority (aka PageRank) not a good idea? Because, according to the Hilltop paper:

    “Since PageRank is query-independent it cannot by itself distinguish between pages that are authoritative in general and pages that are authoritative on the query topic. In particular a web-site that is authoritative in general may contain a page that matches a certain query but is not an authority on the topic of the query.”

    But it’s clear today we still see authoritative sites rank high for off-topic terms (e.g. v7n ranking for “britney spears naked”).

    Which is more powerful: relevance or importance?

    Read my post “free seo course offering expert training” to find out; and then check the SERP for the results.

  12. I’m a bit behind on my feeds… :P

    Best use of play-doh I’ve ever seen!

    Just one thing… I think you should make it clear that the redirects should be made using 301-permanent redirects.

    “First things first… open your website in a browser and type in the domain as http://www.example.com. Then type in example.com without the “www.” If the domains stay the same when you type either one, you need to designate one version over the other.”

    …if one does redirect, you’re not in the clear because it could be a 302 or some other type of redirect.

  13. When I read this post, no-www.org came to mind right away :)

    Sure, the news might not get updated too often over there, but the concept is still a good one.

  14. Basically the whole theory behind PageRank is that at some point it was regularly updated. Now it is regularly neglected yet still respected. You can have a PhD in Microbiology or be like me, who skipped college to become an SEO. In the end, PageRank doesn’t matter; it is rankings that matter. You can have a PR 7 website and rank 30th for, say, “Play Doh” or you can have a PR 4 site and rank #1. In the end it is all about traffic and sales, and if your site doesn’t produce, PageRank will not help you.

    So really, PageRank is good at letting you know what pages to improve. But it doesn’t guarantee that you’re going to have good rankings. It is a false hope that many people still rely on…

    It seems people here are mixing up PageRank with Google’s algorithm. PageRank gives you a pretty number telling you how valuable your site is to the internet. The algorithm that Google uses is the piece of code that actually ranks your site #1…#2…#3 etc.

