This week, Microsoft announced updates to the Live Search crawler, which they say will “improve the efficiency” with which they crawl and index websites. Among the improvements to MSNbot are HTTP compression and something called “conditional GET.”
With HTTP compression, MSNbot now supports gzip and deflate as defined by RFC 2616 (Sections 14.11 and 14.39). This allows for faster transmission time by compressing files and application responses, which in turn reduces the network load on both Microsoft’s end and yours. Additionally, Microsoft has provided a tool that will allow you to check your web server’s compression support.
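To make the negotiation concrete, here is a minimal Python sketch of what a server does when a crawler sends an Accept-Encoding header. The function names (choose_encoding, encode_body) are hypothetical, and the sketch handles only gzip and deflate; it is not how any particular web server or MSNbot itself implements this.

```python
import gzip
import zlib


def choose_encoding(accept_encoding):
    """Pick gzip or deflate if the client offers it, else 'identity'."""
    offered = []
    for item in accept_encoding.split(","):
        token, _, params = item.strip().partition(";")
        q = params.replace(" ", "").lower()
        if q.startswith("q="):
            try:
                # q=0 means the client explicitly refuses this coding.
                if float(q[2:]) == 0:
                    continue
            except ValueError:
                continue  # malformed quality value: skip this coding
        offered.append(token.strip().lower())
    # Server-side preference order (an assumption of this sketch).
    for coding in ("gzip", "deflate"):
        if coding in offered:
            return coding
    return "identity"


def encode_body(body, accept_encoding):
    """Return (payload, extra_headers) for the negotiated coding."""
    coding = choose_encoding(accept_encoding)
    if coding == "gzip":
        payload = gzip.compress(body)
    elif coding == "deflate":
        # HTTP "deflate" is the zlib format, which zlib.compress produces.
        payload = zlib.compress(body)
    else:
        return body, {}
    return payload, {"Content-Encoding": coding, "Vary": "Accept-Encoding"}
```

A client that sends "Accept-Encoding: gzip, deflate" would get the gzip payload here, while a client that sends no such header gets the body unchanged.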
MSNbot now also supports “conditional GET” as defined by RFC 2616 (Section 14.25). This means that the new msnbot/1.1 will not re-download pages that haven’t changed since its last request, provided that the web server honors the “If-Modified-Since” header. For an unchanged page, the server answers with a short “304 Not Modified” response instead of sending the full page again.
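The server side of that exchange can be sketched in a few lines of Python. The function name respond and the dates in the usage note are hypothetical, and the sketch covers only the If-Modified-Since comparison, not the other conditional headers in RFC 2616.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def respond(last_modified, if_modified_since=None):
    """Return (status, headers) for a GET on a resource.

    last_modified must be a timezone-aware datetime; if_modified_since
    is the raw header value sent by the crawler, or None if absent.
    """
    # HTTP dates have one-second resolution, so drop microseconds.
    last_modified = last_modified.astimezone(timezone.utc).replace(microsecond=0)
    headers = {"Last-Modified": format_datetime(last_modified, usegmt=True)}
    if if_modified_since:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            since = None  # unparseable date: treat the header as absent
        if since is not None:
            if since.tzinfo is None:
                since = since.replace(tzinfo=timezone.utc)
            if last_modified <= since:
                # Unchanged since the crawler's copy: send no body.
                return 304, headers
    return 200, headers
```

For a page last changed on 10 Feb 2008 at 08:30 UTC, a request carrying "If-Modified-Since: Sun, 10 Feb 2008 08:30:00 GMT" would get a 304, while a request with an earlier date, or with no header at all, would get a 200 and the full page.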
Support for HTTP compression and conditional GET are both efforts to reduce the amount of bandwidth that MSNbot uses in crawling your site. While most site owners appreciate being crawled and indexed, large sites can take a serious slamming from the various search engine bots constantly hitting them. If your web server is not configured for HTTP compression and conditional GETs, it would be well worth your time to suggest it to your web host.
Another small change that webmasters may notice will show up in their server logs. The Live Search bots have typically appeared in these files as “msnbot/1.0 (+http://search.msn.com/msnbot.htm)”, “msnbot-media/1.0”, “msnbot-products/1.0”, and “msnbot-news/1.0”. The main crawler will now appear as “msnbot/1.1”. At this time, the update applies only to the main MSNbot crawler. The other ‘msnbot-*’ crawlers are expected to be updated in the near future as well.
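If you want to spot the new version in your own logs, a small Python sketch like the following pulls the crawler name and version out of a user-agent string. The pattern and the function name parse_msnbot are illustrative assumptions based on the strings quoted above, not an official format description.

```python
import re

# Matches "msnbot/1.1" as well as variants such as "msnbot-media/1.0".
MSNBOT_RE = re.compile(r"\b(msnbot(?:-[a-z]+)?)/(\d+\.\d+)", re.IGNORECASE)


def parse_msnbot(user_agent):
    """Return (crawler_name, version) or None if this is not an msnbot agent."""
    match = MSNBOT_RE.search(user_agent)
    if match is None:
        return None
    return match.group(1).lower(), match.group(2)
```

Feeding it each user-agent field from a log and counting the results would show when msnbot/1.1 starts visiting and whether the other msnbot-* crawlers have changed version yet.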
Microsoft is directing anyone who is experiencing issues with MSNbot, or who has questions about the updates, to their Crawler Feedback & Discussion form.