Bot Herding: The Ultimate Tool for PageRank Flow

One important rule in SEO is that you cannot rely entirely on search engine spiders and bots when they visit your web site. Bots can generate duplicate content issues, perceive important pages as junk, or cause other problems which don’t actually exist.

Googlebot has, in the past, mistakenly moved valuable pages into the Google supplemental index, or has passed PageRank to pages that do not need to rank, e.g. login, register, and subscribe pages.
Therefore it is necessary to guide these bots if you want to avoid such problems. But how can you guide them?

The noindex HTML meta tag advises search engine bots not to index a web page. The nofollow meta tag, on the other hand, advises search engine bots not to pass rank through the links on a page to the linked pages. The nofollow meta tag does the same thing as rel="nofollow", but at the page level instead of at the per-link level. Are the above methods the ultimate PageRank sculpting solution?

Let’s look into the following scenario:

Page A is linking to page B, and that link is attributed nofollow status. In this case Page A will not pass PageRank to Page B, right? But what if page B is linked from another page of your web site, or from an external web site, without that link being protected with the nofollow attribute? Won’t PageRank be assigned to page B? Won’t a snippet show up in the search results? It will! Is that what you want? Wouldn’t you prefer to make sure that the incoming PageRank for page B is being passed to the most important web pages of your web site? Would it not be better to rank your most lucrative product or service pages? If your answer is yes, continue reading to find out how you can achieve this.

Many webmasters or SEOs might advise you to also disallow page B in the robots.txt, using for example this:

User-agent: *
Disallow: /login.php
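
To see what a rule like this actually tells a crawler, here is a minimal sketch using Python's standard urllib.robotparser module. The www.example.com host and the index.php page are assumptions made for illustration only. The sketch only answers the question "may this bot fetch this URL?"; it says nothing about whether the URL can collect PageRank or be listed in the search results.

```python
from urllib.robotparser import RobotFileParser

# The same rules as above, held in a string so no network access is needed.
ROBOTS_TXT = """\
User-agent: *
Disallow: /login.php
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# www.example.com is a placeholder host, not a site from the article.
login_allowed = parser.can_fetch("Googlebot", "https://www.example.com/login.php")
home_allowed = parser.can_fetch("Googlebot", "https://www.example.com/index.php")

print("May crawl /login.php:", login_allowed)
print("May crawl /index.php:", home_allowed)
```

The check covers crawling only, which is why the Disallow rule alone is not the whole answer.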

But will that solve the problem? Not always. As I said above, if Page B is linked from an external site without being protected by the nofollow attribute, PageRank will still be assigned to it, and it can still show up in the search results. But there is a way, in Google, to overcome this problem.

You can use one of the following methods to overcome the problem:

a. Adding to page B the noindex HTML tag like this:

<meta name="robots" content="noindex" />

or

b. Adding in your robots.txt the noindex directive using this:

User-agent: Googlebot
Noindex: /login.php

Did I say Noindex robots.txt directive? Yes! Google unofficially supports the Noindex robots.txt directive, but at this moment it is not supported by other search engines. Using this directive you can advise Googlebot not to index a page (or to de-index a page if it was previously indexed). But it will not prevent Googlebot from following the links on page B and passing the incoming PageRank to the linked pages which are not protected.
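
If you want to check which of the two signals a page carries, the following is a rough sketch in Python. The class name RobotsMetaParser and the function noindex_paths are hypothetical helpers written for this example, and the sample page is made up. Since the Noindex line is not one of the standard robots.txt rules, the sketch reads it by hand with a deliberately simplified parser that assumes one User-agent line per group.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = {name: (value or "") for name, value in attrs}
        if attributes.get("name", "").lower() == "robots":
            for token in attributes.get("content", "").split(","):
                self.directives.add(token.strip().lower())


def noindex_paths(robots_txt, user_agent):
    """Returns the paths listed on Noindex lines that apply to user_agent.

    Simplified on purpose: assumes a single User-agent line per group.
    """
    paths = []
    applies = False
    for raw_line in robots_txt.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            applies = value == "*" or value.lower() in user_agent.lower()
        elif field == "noindex" and applies and value:
            paths.append(value)
    return paths


# Made-up page B carrying method (a).
PAGE_B = '<html><head><meta name="robots" content="noindex" /></head></html>'
meta = RobotsMetaParser()
meta.feed(PAGE_B)

# robots.txt carrying method (b).
ROBOTS_TXT = "User-agent: Googlebot\nNoindex: /login.php\n"
blocked = noindex_paths(ROBOTS_TXT, "Googlebot")

print("Meta directives on page B:", sorted(meta.directives))
print("Noindex paths for Googlebot:", blocked)
```

Note that neither signal includes nofollow, so the links on page B stay open for the bots to follow.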

But that is not all!

You SHOULD NOT use the nofollow HTML meta tag on Page B, because that will make Page B a dead end. And links pointing to dead end pages are called dangling links.

Now you might ask: Are Dangling Links a problem?

The answer may be found in the following extract from the original PageRank paper by Google’s founders, Sergey Brin and Lawrence Page:

“Dangling links are simply links that point to any page with no outgoing links. They affect the model because it is not clear where their weight should be distributed, and there are a large number of them. Often these dangling links are simply pages that we have not downloaded yet… Because dangling links do not affect the ranking of any other page directly, we simply remove them from the system until all the PageRanks are calculated. After all the PageRanks are calculated they can be added back in without affecting things significantly.”

That said, you can still add the nofollow attribute on outgoing links of Page B, but you must make sure that at least one link may be followed.
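
To get a feel for why the dead end matters, here is a toy calculation in Python. It is only a sketch under stated assumptions: the five page names are invented, the damping factor of 0.85 is the commonly quoted textbook value, and the rank of a dead end page is spread evenly over all pages, which is a common simplification and not the remove-and-add-back procedure described in the quote above. It compares the rank of a products page when page B is a dead end and when page B keeps one followed link to it.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Simple iterative PageRank over a dict of page -> list of linked pages.

    Assumption: rank held by pages without followed outgoing links is
    redistributed evenly over all pages (a textbook simplification).
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        dangling_mass = sum(rank[page] for page in pages if not links.get(page))
        base = (1.0 - damping) / n + damping * dangling_mass / n
        new_rank = {page: base for page in pages}
        for source, targets in links.items():
            if targets:
                share = damping * rank[source] / len(targets)
                for target in targets:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical mini site: "login" plays the role of page B,
# "external" is another site linking to it without nofollow.
base_links = {
    "home": ["page_a", "products"],
    "page_a": ["login"],
    "external": ["login"],
    "products": ["home"],
}

# Case 1: every link on page B is nofollowed, so it is a dead end.
dead_end = pagerank({**base_links, "login": []})

# Case 2: page B keeps one followed link to the products page.
linked = pagerank({**base_links, "login": ["products"]})

print("products rank, page B is a dead end:", round(dead_end["products"], 4))
print("products rank, page B links onward:", round(linked["products"], 4))
```

In this toy graph the products page ends up with a higher score when page B passes its incoming rank onward, which is the whole point of leaving at least one link followed.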

By implementing the above techniques, you can achieve the maximum possible control over the PageRank flow within your web site.

Disclaimer: I said maximum possible control, but not 100% control. If you have an alternative idea and you don’t mind sharing, please feel free to do so.

John S. Britsios (aka Webnauts) is the Founder & CEO at SeoWorkers.com & Webnauts.net, a Web Architect & Senior SEO Consultant, specializing in Web Content Accessibility, Usability Testing, Search Engine / Social Media Optimization & Marketing.

Founder and Chief Information Officer (CIO) of SEO Workers and Chief Executive Officer (CEO) of Webnauts Net, a qualified Forensic SEO & Social Semantic Web Consultant, specializing in Semantic, Forensic & Technical Predictive Search Engine Optimization, Content Marketing, Web Content Accessibility, Usability Testing, Social Semantic Web based Responsive Web Design & Ecommerce Development, UX & Funnel Optimization, Conversion Rate Optimization.

77 thoughts on “Bot Herding: The Ultimate Tool for PageRank Flow”

  1. That was a great post. Have not really heard anyone talk about this but it is quite interesting. It is interesting to read about how one page can pass on pagerank to another and the whole dangling links thing. It really does seem like they are a bad thing.

  2. Jaan, it’s actually about showing Google which pages I feel are important and which are not. And if unimportant pages have IBLs carrying PR, then I can pass that PR to my important pages. If we should call that PageRank Siloing, then I am fine with that.

  3. Yet really interesting thoughts. It makes a lot of sense to take care of your internal website “juices” once you have hundreds or thousands of content pages…

  4. What can I say? I know your in-depth research and testing is mind-boggling to me, but always the perfectionist – always right on!

    Kudos to you and as always, thanks for sharing.

  5. John: “Michael if you read the article and you don’t agree with it, I should assume that you believe that pages not indexed in Google’s index can have PR.”

    Michael: Yes, they can have PR. So Google employees have said through the years.

    The best way to control the flow of PageRank through a site is to place more links to the most important destinations throughout the site.

  6. Michael, if that is true, then all the HTML files on my personal PC hard drive must have PR, but ok… maybe just because it is connected to the Internet. :)

    Or if I have pages within a password protected area they have PR too.

    To sum up, pages that Google does not know about or that are not indexed have PR. Is that what you mean?

    And by the way can you explain what is the difference between the grey and white PR bar?

  7. Google indexes URLs that it hasn’t crawled. That’s a well-published fact. The core PageRank algorithm is not concerned with whether a page has asked that it not be indexed, but rather with only whether there are links pointing to a page.

    As for the difference between the grey and the white PR bars, it’s a color difference.

    I, personally, could not care less about Toolbar PR.

  8. I feel like jumping into the discussion between Michael & John.

    Where ‘sculpting’ pagerank is concerned- these tactics are more of a directive so that if people link to your login page that link juice is passable onwards and not wasted. If you have a page that is not indexed but still has PR, you can funnel that juice out to your indexed and focused pages.

    Cheers,

    Derek

  9. I’ve talked about this issue so much with John and others over the last year that I’m not even sure of my opinion any more!?

    One thing I will say is that for the average SEO or webmaster, simply building your internal linking in the right way (Michael – “The best way to control the flow of PageRank through a site is to place more links to the most important destinations throughout the site.”) will result in the right pages going into the main index.

    I know John will now reply with, “why carry on sending pagerank to the supplemental pages?” but the simple answer is… does it really make any difference if you only push PR to chosen pages? Where are the traffic and rankings figures before and after this has been done?

    I’ve done basic sculpting on small sites and it’s made no difference but I can see this being more of an issue on huge sites.

  10. Michael: Google indexes URLs that it hasn’t crawled. That’s a well-published fact. The core PageRank algorithm is not concerned with whether a page has asked that it not be indexed, but rather with only whether there are links pointing to a page.

    John: I agree. But what happens with the PR when links are pointing to a page which may not be indexed, but still has outgoing links?

    Michael: As for the difference between the grey and the white PR bars, it’s a color difference. I, personally, could not care less about Toolbar PR.

    John: I personally could not care less about the Toolbar PR either. I still believe that you know why I asked. So why don’t you answer that?

    Derek: Where ‘sculpting’ pagerank is concerned- these tactics are more of a directive so that if people link to your login page that link juice is passable onwards and not wasted.

    John: Exactly, Derek. I guess I used the wrong term in my article. I probably should have called that PageRank Directive Technique, and not PageRank Sculpting. :)

    Matt: I know John will now reply with, “why carry on sending pagerank to the supplemental pages?” but the simple answer is… does it really make any difference if you only push PR to chosen pages?

    John: No Matt. I will not ask you “why carry on sending pagerank to the supplemental pages?”. But I would like to ask you, for what purpose do you use the “nofollow” attribute? Or do you only use it for paid links and/or comment spam?

  11. When it comes to clients’ sites I don’t play around with nofollow too much. When I do, I use it for dodgy external links and links to pages I really don’t want in the main index. For instance I’ve got a client who has two sides to his site, commercial and residential. There’s some serious duplication and keyword cannibalisation between the two so I’ve blocked the commercial side and the links to it with nofollow. That area of the site is now completely invisible and it was the easiest way to solve the problems without fully restructuring and re-writing.

    On my blog I’ve nofollowed NEARLY all the category links (planning on adding unique content to each category page soon so I don’t want them fully blocked) and archive links. I’ve also nofollowed the outbound links on my home page to try and push the incoming PR and SEs to my posts and pages.

    And comment links/social links/RSS links/admin links/readmore links of course! It makes sense (in my opinion) to remove any links which aren’t to rankable pages or don’t have keyword-rich anchor text. Therefore removing any link which won’t directly benefit rankings.

  12. But I think it’s worth adding that a lot of what I’m doing on my personal site is testing and experimentation, and it’s also worth pointing out that a lot of what can be achieved with nofollow can be achieved by better interlinking.

    While you’re “herding” the bots around the field I’m sat on the fence!

  13. Matt: There’s some serious duplication and keyword cannibalisation between the two so I’ve blocked the commercial side and the links to it with nofollow. That area of the site is now completely invisible and it was the easiest way to solve the problems without fully restructuring and re-writing.

    John: If I discover that area of the site as a visitor, will it be completely invisible to me? If not, how would you like it if I linked back to pages in that area? And why shouldn’t the PR pass over to the visible pages, if there are already links to them on those invisible pages, as you call them? Or am I missing something?

    Matt: I’ve also nofollowed the outbound links on my home page to try and push the incoming PR and SEs to my posts and pages.

    John: To avoid any misunderstandings, do you nofollow outbound links to external sites on your homepage trying to push the incoming PR to your posts and pages?

    Matt: And comment links/social links/RSS links/admin links/readmore links of course! It makes sense (in my opinion) to remove any links which aren’t to rankable pages or don’t have keyword-rich anchor text.

    John: I prefer redirecting the links to an intermediate page that is blocked from search engines with a robots.txt file. See option 2 here: http://www.google.com/support/webmasters/bin/answer.py?hl=en&answer=66736

  14. Great Matt!

    And I hope you remember that I have posted that link in forums countless times.

    Now two tips:

    - You cannot block the redirect file itself. If you have a redirect file called e.g. redirect.asp, that cannot be blocked with the robots.txt. But I have already explained in different forums how that can work.

    - Make the redirects 301 and not 302. Otherwise I will not be surprised if Google will perceive that as a content theft attempt.

  15. This is great free advice and a very healthy conversation (my head is spinning) that gurus charge beaucoup de money to teach. Thank you for that. I am not sure if you all came to a final conclusion / agreement.

    Personally, I experimented with the nofollow links on a brand new site and had the reverse effect so far. My contact us and about us pages were indexed first.

    I assume this is temporary and due to the fact that my home page is not pointing to any other page (dangling links) . We’ll see…

  16. Bot Herding! I love this term :D

    I hadn’t actually thought of stopping Google ranking some of my pages but I’ll give this a go.

    Our clients area and various contact pages will be where I start first.

    Thanks for posting this :)

  17. I like this posting and will try to figure out the codes that you have mentioned above. Whatever the dispute in this comment room, I still thank you for bringing this out.

  18. Is it possible that you could give us a sample of how the noindex robot is done with page A and B.
    If you have http://www.xyz.com being advertised for an ad, but need it to be a no follow, but the advertising company cannot do a no follow, how can you overcome this? Do I create an intermediate page, like http://www.xyz.com/newpage, and have this new page with the no index file on it? Appreciate your help. thanks.

  19. I want to get Google PageRank and I found your blog while looking for information.
    Thanks a lot.

  20. I just came across this post – this is a little above my head but I am interested in knowing how to increase the PR of my site and I am not an SEO expert. Is there anywhere that you could point me?
    Thanks in advance

  21. I think this is too complex for a beginner to understand properly. Anybody who needs more is advised to contact an SEO consultant.

  22. I have an Alexa of 400k and a PR of 0. Google answered my question on why I have a PR of 0, they said that they don’t like .biz domains, although I support a .biz with a PR of 4.
    The one thing that bothers me is the attempt to create a PR. This is exactly what Google does not want people to do. They have been clear that an entire site should be either follow or nofollow and that any attempt to choose how the Googlebot sees your site will be held against you.
    I hope your sculpting method works for you, I choose to not do it.

  23. @ McLaughlin, Alexa is not an accurate tool for measuring the success of a web site and Google has no problem with .biz domains.

    Also no one is talking about creating PR, as that is not possible. The topic is about flowing the PR to the important pages which you want to rank. And neither Google nor any other search engine has a problem if you tell them not to index pages of your site. If that were true, we would not be allowed to use meta robots tags and robots.txt.

    Google only stated that the nofollow attribute does not work anymore for internal links and attempts at PageRank Sculpting. So if you continue using it, it will not make any difference. The juice will pass. And from my point of view I am glad they did that, since the nofollow thingy was about fighting spam and not sculpting PR.

    I agree that it was my mistake to use the word sculpting, because the technique I mentioned here is about PR flow and not sculpting. Read my previous comments for further understanding.

    Therefore you seem to be seriously misinformed about many SEO concepts and you definitely did not understand the concept I wrote about here.

  24. I think as SEO companies we should focus more on ranking than PR sculpting. As long as the pages are indexed.

    What tool do you use to measure success of a website? Not Alexa. Surely not PR right?

  25. Page Rank is dead.

    According to Google, it just doesn’t matter as much. There are just too many other factors now.

    If I were you I would be worrying about your twitter followers not page rank.

  26. @Surf Ads, if you understood my article and my previous comments here, you would have realized that it is not about PageRank Sculpting. It is about PageRank Flow. I am sorry that the title of the article has misled you. And what I wrote works.

  27. @John S. Britsios

    Bruce says page rank is for show.

    SEO focuses on your content, rank is something to show your client as a metric, he also explains the visual need, for SEO clients, the first 6 minutes.

    Jump to 5:45 to hear him say it’s a visual tool now more than anything else.

  28. @John S. Britsios,

    You are getting most of the traffic to this page from social bookmarks.

    Without [Book Marketing] how much rank or traffic would it get from Page Rank Sculpting or Flow?

    When you connect to thousands of people on twitter, you are connecting to their Twitter page rank which increases your Twitter page rank and the Page Rank of the link you specify in your settings.

  29. @John S. Britsios, How is it off topic? SEO is the topic. Why do you have comments on your page if you can’t appreciate a good text sparring over what really matters?

    My point is that maybe they should focus on their Twitter followers, probably more than on Google SEO. Don’t worry too much about what Google wants from you. It’s not up to them.

    Twitter sends me more traffic than Google, Yahoo, and MSN combined. Most of the traffic coming from Google is regular people who have already found my site, but just can’t remember the full domain name.

    (People actually use their search box as a domain browser box)

  30. Hi John,

    This topic was rather vague to me before. I thought that web pages just have to have their own SEO and never really understood how the rest of a website can contribute to the most important page.

    I have a new insight to work on. Thank you for sharing your knowledge.

  31. Learned more from the comments than the actual article but it doesn’t mean the article isn’t helpful. If it wasn’t written then there wouldn’t be any informative comments made. TY

  32. Interesting stuff.
    I always take pagerank into account when link building.
    Get most inbound links to my index, quite a lot to the categories, and a few to each ‘leaf nodes’ ie pages.

    I don’t use nofollow though, perhaps I should.

  33. What I am wondering after reading these many comments is: “Will Google penalize you for sculpting or channeling what the spiders see?” I guess there is no real way for them to know… right? Unless you conduct some sort of before and after analysis. I have been coming across many stories of Google penalizing people recently so that has me a little on edge.

    1. @Kingpin: I have changed the title of the tutorial as it was misleading, since it is not about sculpting. It is about PageRank channeling or flow. And that has absolutely nothing to do with manipulation or anything similar to that. It is what we call PageRank flow or siloing.

  34. What is the best solution? I read your post, but I can't understand it clearly.
    If I want a page not to be indexed, not to get any PR, and not to affect other pages, how can I do that?

  36. What about the meta does not work. because I did not get backlinkgoogle at all but already there are meta google.

    good luck on my blog and thank for share.