Yahoo Blocking Bots from Spidering Delicious Bookmarks

Over the weekend, Yahoo’s social bookmarking property Delicious (del.icio.us) began blocking spiders and bots from non-Yahoo search engines, preventing them from crawling the site and discovering new web pages, sites and bookmarks.

Colin Cochrane found this out the other day, saying that ‘This isn’t a simple robots.txt exclusion, but rather a 404 response that is now being served based on the requesting User-Agent.’

I took a look at del.icio.us’ robots.txt and found that it was disallowing Googlebot, Slurp, Teoma, and msnbot for the following:

Disallow: /inbox
Disallow: /subscriptions
Disallow: /network
Disallow: /search
Disallow: /post
Disallow: /login
Disallow: /rss
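To see what rules like these actually cover, here is a minimal sketch using Python's standard `urllib.robotparser`. The grouping of the four crawler names under one rule set is an assumption (the exact layout of the file is not reproduced here), and the `/url/example` and `/someuser` paths are hypothetical test paths.

```python
from urllib.robotparser import RobotFileParser

# Assumed reconstruction: the crawler names and Disallow lines come from the
# article, but the exact layout of the real file is not known.
ROBOTS_TXT = """\
User-agent: Googlebot
User-agent: Slurp
User-agent: Teoma
User-agent: msnbot
Disallow: /inbox
Disallow: /subscriptions
Disallow: /network
Disallow: /search
Disallow: /post
Disallow: /login
Disallow: /rss
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt content from a string instead of fetching it."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def is_allowed(parser: RobotFileParser, user_agent: str, path: str) -> bool:
    """Return True if the rules let this crawler fetch the given path."""
    return parser.can_fetch(user_agent, path)


parser = build_parser(ROBOTS_TXT)

# Hypothetical paths: two from the Disallow list, two outside it.
for path in ("/search", "/rss", "/url/example", "/someuser"):
    print(path, is_allowed(parser, "Googlebot", path))
```

With these assumed rules, the listed paths are refused for Googlebot, while paths outside the list remain crawlable as far as robots.txt is concerned.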

Seeing that the robots.txt was blocking these search engine spiders, I tried accessing del.icio.us with my User-Agent switcher set to each of the disallowed User-Agents and received the same 404 response for each one.
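The same test can be sketched without a browser extension. The snippet below is an illustration only: the short crawler tokens stand in for the full User-Agent strings real crawlers send, and the function names are hypothetical.

```python
import urllib.error
import urllib.request

# Short crawler tokens as named in the article; real crawlers send longer
# User-Agent strings, so these values are illustrative.
CRAWLER_AGENTS = ["Googlebot", "Slurp", "Teoma", "msnbot"]


def build_request(url: str, user_agent: str) -> urllib.request.Request:
    """Build a GET request that identifies itself with the given User-Agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def status_for(url: str, user_agent: str, timeout: float = 10.0) -> int:
    """Fetch the URL and return the HTTP status code the server answers with."""
    try:
        request = build_request(url, user_agent)
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urlopen raises on 4xx/5xx; the status code is what we want here.
        return err.code


def probe(url: str, agents=CRAWLER_AGENTS) -> dict:
    """Map each User-Agent to the status code the server returned for it."""
    return {agent: status_for(url, agent) for agent in agents}
```

Calling `probe()` with the site's URL returns one status code per User-Agent, which makes it easy to compare against the response a regular browser string receives.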

Colin also found that Delicious pages listed in Google are lacking a cache, title, description and other information.

Why would Yahoo do this?

Yahoo has a competitive advantage over Google, MSN and Ask.com because human bookmarking on Delicious lets it identify web pages and other content before search engine bots can. Yahoo can also classify web documents via human descriptions and tagging, lending external metadata to these documents, which can result in more relevant web results and intent-targeted rankings.

Yahoo has integrated Delicious into its search results, and Delicious evidently plays a very important role in Yahoo Search, so Yahoo is taking full advantage of its property by blocking its competition from crawling that information.

It’s a bold move by Yahoo, if only because Delicious is user-powered and dependent on a community of users. On the other hand, keeping your internal secrets from your competition is a basic business practice, and Yahoo has essentially set up a security fence to keep Google, Ask.com and MSN from snooping around its back yard.

None of Yahoo’s competitors has anything that can compare to Delicious. Google made a very large mistake by not buying StumbleUpon for this very reason.

Bold move by Yahoo, but competitively the correct move. Your thoughts?

[Additional discussion on Sphinn]

Loren Baker is the Founder of SEJ, an Advisor at Alpha Brand Media and runs Foundation Digital, a digital marketing strategy & development agency.

9 thoughts on “Yahoo Blocking Bots from Spidering Delicious Bookmarks”

  1. Loren, neither the bookmarks nor the profiles linking to the sites are blocked. The bookmarks are all under the /url/ directory, and the profiles are all under root. The only things blocked are the unimportant things that del.icio.us always blocked.

    Look at the cache of robots.txt, and you can see that it hasn’t changed since Dec 24th.

    He misinterpreted what he saw.

  2. Pratheep, everyone is citing Colin’s article, though. If the initial source is incorrect in its assumptions (which is what I think is the case), then it doesn’t matter how many people repeat them… they would all be wrong.

    I’m neither the first nor the only person to point out that they are probably checking by IP, and only blocking that user agent from invalid networks.

  3. Thank you Michael.

    Ugh! I’m tired of misinformation being spread by self-proclaimed “SEO Analysts”. This is just plain false.

    Michael is right. The only things being blocked are admin-type pages and spoofers.
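
The IP-based check the commenters describe could look roughly like the sketch below. It is an assumption about how such a filter might work, not Delicious's actual implementation: a request that claims a crawler User-Agent is accepted only if the IP's reverse DNS name falls under that crawler's domain and the name resolves back to the same IP. The domain table and function names are hypothetical.

```python
import socket

# Illustrative table (assumption): hostname suffixes a genuine crawler's
# reverse DNS record is expected to end with. A real filter would cover
# every crawler it cares about.
CRAWLER_DOMAINS = {
    "googlebot": (".googlebot.com", ".google.com"),
}


def reverse_lookup(ip: str) -> str:
    """Return the primary hostname registered for an IP address."""
    return socket.gethostbyaddr(ip)[0]


def forward_lookup(host: str) -> list:
    """Return the IPv4 addresses a hostname resolves to."""
    return socket.gethostbyname_ex(host)[2]


def is_spoofed(user_agent: str, ip: str,
               reverse=reverse_lookup, forward=forward_lookup) -> bool:
    """True if the User-Agent claims a known crawler but the IP does not verify."""
    ua = user_agent.lower()
    for token, suffixes in CRAWLER_DOMAINS.items():
        if token in ua:
            try:
                host = reverse(ip).lower()
                # The hostname must sit under the crawler's domain AND
                # resolve back to the requesting IP.
                verified = host.endswith(suffixes) and ip in forward(host)
            except OSError:
                # DNS failures (socket.herror, socket.gaierror) count as unverified.
                verified = False
            return not verified
    return False  # not claiming to be a known crawler


def status_code(user_agent: str, ip: str, **resolvers) -> int:
    """Serve 404 to spoofed crawler User-Agents and 200 to everyone else."""
    return 404 if is_spoofed(user_agent, ip, **resolvers) else 200
```

Under this scheme, a browser with a User-Agent switcher set to a crawler string would receive a 404, while the genuine crawler coming from its own network would not, which matches what the commenters suggest was happening.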