Research Firm Says Yahoo, MSN, & Google Have Major Page-Caching Flaw

A “significant” vulnerability has been found in the page-caching technologies of the three major search engines – Google, Yahoo, & Microsoft Live Search. Researchers at Aladdin Knowledge Systems discovered the flaw, which allows the search engines to deliver malicious pages that have already been removed from the web. The discovery was made when the researchers were analyzing the content of a hacked university website, which had been cleaned up. However, the malicious content was still accessible through the search engines’ cached pages.

To take advantage of this flaw, Aladdin suggests that an attacker could create multiple malicious web pages at various hosting services, promote them so that the search engines pick them up, and then take the pages offline so it appears as if there is no threat. Then, a series of links amongst multiple websites could be used for a cross-site scripting attack.

When contacted about the flaw, Microsoft said that it was not aware of any negative customer impact but was investigating the issue. A Yahoo spokesman promised a quick response to the threat, and Google has not yet commented.

A cached page is essentially a snapshot of that page as it was at a particular time. As things stand now, until the search engine bots crawl the page again, the cached copy will show the page as it was the last time they crawled it. If there was malicious code or links on the page at that point, that’s how it is going to appear in the cache. The same holds true for links to illegally distributed proprietary software: if it was there when the site was crawled, it’ll likely still be in the cache and people will still be able to find it. It seems that Aladdin is hoping the search engines will devise some way for their bots to identify malicious code and skip indexing any pages that include it, which wouldn’t actually be a bad idea.
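
The snapshot behavior described above can be illustrated with a toy model. This is only a sketch: `ToyCache`, `live_web`, and the example URL are hypothetical names made up for illustration and do not reflect how any real search engine stores its cache.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCache:
    """Hypothetical stand-in for a search engine's page cache."""
    snapshots: dict = field(default_factory=dict)

    def crawl(self, url, live_web):
        # A crawl stores whatever the page contains at that moment.
        if url in live_web:
            self.snapshots[url] = live_web[url]

    def cached(self, url):
        # The cache serves the last snapshot, not the live page.
        return self.snapshots.get(url)


url = "http://example.edu/page"  # made-up URL
live_web = {url: "<script>malicious()</script>"}

cache = ToyCache()
cache.crawl(url, live_web)       # bot visits while the page is compromised

live_web[url] = "<p>clean</p>"   # site owner cleans up the page
stale = cache.cached(url)        # cache still holds the old snapshot

cache.crawl(url, live_web)       # only a re-crawl refreshes the snapshot
fresh = cache.cached(url)
```

Between the cleanup and the second crawl, `stale` still contains the malicious markup, which is the window the article describes.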

4 thoughts on “Research Firm Says Yahoo, MSN, & Google Have Major Page-Caching Flaw”

  1. This would be a bigger problem with sites like archive.org as they keep old pages forever by design.

    With search engines the old pages are eventually dropped, some in a matter of days to weeks.

    Be aware that Google can keep some old pages for a year when they are in the Supplemental Index.

  2. Oh yes, removed pages can be found in the cache of some search engines for eternity.

    I’m not a big fan of the “cache” functionality, since it has sometimes been exploited by scrapers to steal content from websites. “noarchive” is a good idea.
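
The “noarchive” directive mentioned in the last comment is normally set with a robots meta tag such as `<meta name="robots" content="noarchive">`. As a rough sketch, a site owner could check whether a page carries it using only Python's standard library. The names `RobotsMetaParser` and `has_noarchive` are hypothetical helpers written for this example, not part of any search engine's tooling.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() != "robots":
            return
        # The content attribute is a comma-separated list of directives.
        for directive in attr.get("content", "").split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def has_noarchive(html):
    """Return True if the page asks search engines not to keep a cached copy."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noarchive" in parser.directives


page = '<html><head><meta name="robots" content="noindex, NOARCHIVE"></head></html>'
result = has_noarchive(page)
```

This only inspects the generic `robots` meta tag. Crawler-specific tags and HTTP headers are outside the scope of the sketch.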