I have already looked into ways to increase the crawl rate for webmasters who suspect their site deserves more frequent bot visits. From time to time, however, I come across forum threads where webmasters complain that (Google)bot seems too active on their pages, eating up too much bandwidth.
So here I am listing those complaints and possible solutions that might help handle the situation:
- Verify the Googlebot – make sure it really is Google that is so interested in your site;
- Change Googlebot’s crawl rate via Google Webmaster Tools (while the “higher rate” option is available only to a few lucky people, a “slower rate” can be set by anyone);
- Make sure Googlebot is not crawling any extra URLs – e.g. duplicate URLs, URLs containing session IDs, or other (multiple) parameters not necessary for indexing:
- Additive filtering of a set of items;
- Sorting parameters; etc.
- Make sure your pages don’t have too many unique URLs that keep the bot too busy: the most popular and widely known guideline is no more than 100 links per page (that’s not a hard limit, of course – Google can cope with more – just keep the number reasonable);
- Make sure Googlebot is not busy crawling huge images or PDFs – i.e. mind the size of your files and your overall page load time.
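For the “extra URLs” point – session IDs, filter and sort parameters – the simplest fix is a couple of wildcard rules in robots.txt. A minimal sketch, assuming your parameters happen to be called `sessionid` and `sort` (substitute your own names):

```
User-agent: *
# Keep crawlers off URL variants that only differ by session or sort order
# (the parameter names below are hypothetical examples).
Disallow: /*?*sessionid=
Disallow: /*?*sort=
```

Googlebot understands the `*` wildcard in Disallow patterns, though not every crawler does.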
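The first step above – verifying that a visitor claiming to be Googlebot really comes from Google – can be automated with the double DNS lookup Google recommends: a reverse lookup of the IP must return a googlebot.com/google.com hostname, and that hostname must resolve back to the same IP. Here is a small sketch (the helper name `is_googlebot` is my own):

```python
import socket

def is_googlebot(ip):
    """Verify a claimed Googlebot IP via reverse + forward DNS lookup."""
    try:
        # Reverse DNS: the hostname must belong to Google's crawler domains.
        host = socket.gethostbyaddr(ip)[0]
        if not host.endswith(('.googlebot.com', '.google.com')):
            return False
        # Forward DNS: the hostname must resolve back to the same IP.
        return socket.gethostbyname(host) == ip
    except (socket.herror, socket.gaierror):
        # No reverse record, or the hostname doesn't resolve - not Google.
        return False
```

Feed it the IPs of suspiciously hungry visitors from your access logs; anything that fails the check is an impostor spoofing the Googlebot user-agent.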
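To spot-check a page against the ~100-links guideline, a few lines with the standard-library HTML parser are enough (the class and function names here are my own invention):

```python
from html.parser import HTMLParser

class LinkCounter(HTMLParser):
    """Count <a href> links on a page to check against the ~100-link guideline."""
    def __init__(self):
        super().__init__()
        self.count = 0

    def handle_starttag(self, tag, attrs):
        # Only count anchors that actually carry an href attribute.
        if tag == 'a' and any(name == 'href' for name, _ in attrs):
            self.count += 1

def count_links(html):
    parser = LinkCounter()
    parser.feed(html)
    return parser.count
```

Run it over your template pages; if the count is in the hundreds, the navigation or tag clouds are probably worth trimming.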
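And for the last point – huge images and PDFs – a quick walk over the document root will surface the worst offenders before the bot finds them. A sketch with an arbitrary 1 MB threshold (tune it to your own bandwidth budget):

```python
import os

def oversized_files(root, limit_bytes=1_000_000):
    """Walk a document root and return files larger than limit_bytes,
    biggest first. The 1 MB default is an arbitrary example threshold."""
    hits = []
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            size = os.path.getsize(path)
            if size > limit_bytes:
                hits.append((path, size))
    return sorted(hits, key=lambda item: item[1], reverse=True)
```

Anything this turns up is a candidate for compression, resizing, or a robots.txt Disallow rule.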