Craigslist Blocks Search Engine Spiders and Listings

Loren Baker

01/3/06

17 Comments

According to a thread at SERoundtable Forums, Craigslist has blocked its classifieds sites from being spidered and indexed by search engine robots, the programs that crawl web sites and store each page's information in a search engine's index. That indexed information is later delivered to the end user who searches on Google, Yahoo, Ask, MSN or other engines.

As a result of the blocking of search engine spiders or bots, Craigslist pages are no longer showing in search engine results. A member of the SER forum writes, “So the answer is clear. Craig blocked the bulk of his content from being crawled. A query in Google or Yahoo for an item in Craig’s ‘jobs’ or ‘for sale’ section will confirm that his content has been removed entirely. To my knowledge, this is the largest deindexing ever. Tens of millions of pages vanished.”

So basically, pages within Craigslist are not listed in search engine results. Why would they want to do such a bulk delisting when webmasters all over the globe are scrounging to get such valuable search traffic? It could be an effort to deter spammers from posting bogus links on Craigslist in the hope that search engines will follow those links. The delisting may also be an attempt to stop bots from scraping the site’s content by pulling cached copies of Craigslist pages from search engines and then reprinting that content illegally on other sites.

Remember last year, when Craigslist blocked aggregation sites and niche engines like Oodle from indexing its classified ads? This recent block of the major search engine bots may be an extended effort to establish Craigslist as more of a destination, and not as content fuel for the local efforts of Yahoo, Google (& Google Base), Ask, & MSN.
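For readers wondering how a site tells spiders to stay out: the usual mechanism is a robots.txt file at the root of the site, which well-behaved crawlers read before fetching pages. The sketch below uses Python's standard urllib.robotparser to show how a crawler might interpret such a file. The rules, the bot name "ExampleBot" and the example.org URLs are all made up for illustration and are not Craigslist's actual robots.txt. Crawl-delay is also a non-standard directive that only some crawlers honor.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration only; NOT Craigslist's real robots.txt.
HYPOTHETICAL_ROBOTS_TXT = """\
User-agent: *
Disallow: /jobs/
Disallow: /forsale/
Crawl-delay: 10
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a parser a crawler can query."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(HYPOTHETICAL_ROBOTS_TXT)

# "ExampleBot" is an invented crawler name; the URLs are placeholders.
blocked = parser.can_fetch("ExampleBot", "https://example.org/jobs/123.html")
allowed = parser.can_fetch("ExampleBot", "https://example.org/about/help")
delay = parser.crawl_delay("ExampleBot")

print("May fetch a /jobs/ page:", blocked)
print("May fetch an /about/ page:", allowed)
print("Requested seconds between requests:", delay)
```

A file like this can either remove whole sections from search results (the Disallow lines) or merely ask crawlers to slow down (the Crawl-delay line), which is why the same file can be read as a "block" or as a way to ease server load.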

17 Comments

  • Sean says:

    Another possibility is that the search engines may have been serving up old listings and thus delivering hordes of users to ‘post expired’ pages. I’m not sure how often the SEs reindexed Craigslist, but it could be a possibility. And speaking of reindexing, let’s say Googlebot and other crawlers realize that Craigslist adds thousands of pages a day and thus crawl the site almost non-stop. With millions of pages, that could translate into a massive server and bandwidth load.

    Ultimately it seems like an odd move. I can’t see the benefits outweighing the negatives here.

    –Sean

  • William says:

    Craigslist has chosen not to allow content from its site to be used to add value to competitors’ sites, i.e. Google, Yahoo, etc. They believe that their content, value proposition and brand are unique enough that users will come to them for the information. Maybe others will also begin to look this way at content that they have created or added value to. Should other companies have the ability to grow rich on content that they have indexed from you? Does the company that indexed the content add value to your content? What side of the equation is the content creator on in most indexing situations? It’s a choice that Craigslist can make. Even if I did not agree with them, I would applaud them for going against the crowd.

  • David Utter says:

    SEW has posted an update to the story. Craigslist isn’t blocking crawlers; the robots.txt file they use apparently makes crawling less demanding on Craigslist servers.

  • Loren says:

    David, thanks for piping up with the correction to this story.

  • Steve says:

    I can understand wanting to ‘unplug.’ Cyberspace is certainly red hot. It just seems to me that the business of Craigslist is to connect people, and seeing that search engines facilitate and accelerate that purpose, (if I were Craig) I would have found a way to make the search engines work for me.

  • Dave says:

    Even without a massive influx of search engine traffic to Craigslist, it’s still a weird place on its own. Just take a look at its personal ads, which are some of its most popular postings. I believe they’re even writing a book that will include some of the top postings from the site.

  • Craig says:

    I understand craigslist not wanting the ads to be indexed, as they expire in 7 days and are not useful search engine data.

    I put together a craigslist search tool that emails you whenever something you are searching for is posted on craigslist. You can see it at: http://craigslist.craig2mail.com/

  • Warwell says:

    Uhhh…search Google for “Skate Board Printing”.
    I’m number 1 and 2 with my CL ad. Is this info out of date?

    Hit me up, I’ll show you what I did.

    http://www.warwell.com/blog

  • Mike says:

    This Craigslist Searcher searches constantly for you

    Enter your search only once

    http://www.CraigsPal.com

  • There is a combination of many valid reasons above including the abuse of people using it for spam purposes.
    http://www.erectile-dysfunction-treatments.org

  • Craig says:

    You can search craigslist, kijiji and oodle on multiple cities at once now with http://www.craigslist-search.com

  • Search says:

    Here is a site I found recently with many free craigslist addon tools. http://www.craigslistcompanion.com

  • ShortGig says:

    If you are looking for a better site to Post and Search for Gigs, check out ShortGig, it rocks!

  • motiont says:

    Want to search multiple locations and categories in craigslist? Want to be notified as soon as the item meeting your criteria pops up on craigslist?

    visit http://www.motiont.com/craigslistreader.aspx

  • Chris says:

    I got involved as an expert for Craigslist. I got dozens of links quickly. Each time I answered a question I got a link. However, if you let your questions go overnight, then you have to work really hard to answer them. When it all gets too much, you are removed as an expert.

  • Fried Pigs says:

    Purple peanut squirrels miss yellow piggy goes poop. Why are gyroscopes within the tangible atmosphere. Pews are not unless we can seek the duckling forever.

  • Ian says:

    I wrote a Craigslist search tool available at http://www.NotifyWire.com.

    You can save searches to run again later and even monitor Craigslist for new results and get alerted by SMS and email.

    Because my software is based on grid computing, you don’t need to leave your computer running all day to monitor Craigslist. At the same time, this allows us to check Craigslist about every 5 minutes without any one machine making more than 200 hits against Craigslist per day.

    I use it to monitor for new telecommuting tech gigs across the country and I’m almost always the first person to respond to the gigs I want.

    Check it out and let me know what you think.