Is the White House blocking search engines from indexing government web pages that feature information on the Iraq War?
2600 reports “WHITE HOUSE’S SEARCH ENGINE PRACTICES CAUSE CONCERN”
As the war in Iraq continues, is the White House intentionally preventing search engines from preserving a record of its statements on the conflict? Or did its staff simply make a technical mistake?
When search engines “spider” the web in search of documents for their indices, they check for a file called robots.txt, which web site owners sometimes place on their servers to instruct the spiders not to index certain files. This can be done for policy reasons, if an author does not want his or her pages to appear in search listings, or for technical reasons, for example if a web site is dynamically generated and cannot, or should not, be downloaded in its entirety.
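For illustration, a minimal robots.txt might look like the following; the paths here are invented to mirror the situation described and are not quoted from the actual whitehouse.gov file:

    User-agent: *
    Disallow: /cgi-bin
    Disallow: /news/releases/iraq

A spider that honors the convention reads this file before crawling and skips any URL whose path begins with one of the Disallow prefixes; a site with no robots.txt is treated as entirely open to indexing.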
According to reports, though, the White House is requesting that search engines not index certain pages related to Iraq. In addition to keeping those pages out of search results, this prevents archives such as Google’s cache and the Internet Archive from storing copies of pages that may later change. 2600 called the White House to investigate the matter.
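As a rough sketch of how a compliant spider or archiver applies such rules, the following uses Python’s standard urllib.robotparser module; the Disallow path and the URLs are hypothetical stand-ins, not entries taken from the real file:

    from urllib.robotparser import RobotFileParser

    # Hypothetical rules echoing the excerpt above; not the actual file.
    rules = [
        "User-agent: *",
        "Disallow: /news/releases/iraq",
    ]

    parser = RobotFileParser()
    parser.parse(rules)

    # A compliant spider checks each URL against the rules before
    # fetching; a "blocked" verdict means the page is neither indexed
    # nor cached for posterity.
    for url in (
        "http://www.whitehouse.gov/news/releases/2003/10/sample.html",
        "http://www.whitehouse.gov/news/releases/iraq/sample.html",
    ):
        verdict = "allowed" if parser.can_fetch("*", url) else "blocked"
        print(url, "->", verdict)

The first URL prints as allowed and the second as blocked; under this convention, only the unblocked copy of a page survives in search indices and caches.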
According to White House spokesman Jimmy Orr, the blocking of search engines is not an attempt to ensure that future revisions will go undetected. Rather, he explained, they “have an Iraq section [of the website] with a different template than the main site.” Thus, for example, a press release on a meeting between President Bush and “Special Envoy” Bremer is available both in the Iraq template (blocked from being indexed by search engines) and in the normal White House template (available for indexing). The intent, Mr. Orr said, was that when people search, they should not get multiple copies of the same information. Most of the “suspicious” entries in the robots.txt file do, indeed, appear to have only this effect.