On August 25th 2004 the comp.risks forum ran an article I submitted regarding the large number of Microsoft Word documents available on US military sites (sites in the .mil domain) through Google searches (23.50, "U.S. military sites offer a quarter million Microsoft Word documents"). The article documented how such documents could lead to the leakage of confidential data. A week later I set up a script to track the number of Word documents available through Google searches, to see if and when the military would recognise the threat those documents posed and remove them.
According to the data I gathered, the number of .mil Word documents returned by Google peaked at 1,180,000 on September 20th 2005, and then started declining gradually. Currently there are 941,000 documents online, about 20% below the peak. No such decline was visible in the other domains I monitored, so the change is probably not an artefact of Google's collection or query mechanisms, but an organized move by the US military. The following charts illustrate the changes in the number of Word documents available in each of a number of different domains (red) compared to the total number of documents available across all monitored domains (green).
Last modified: Sunday, November 13, 2005 3:16 pm
Unless otherwise expressly stated, all original material on this page created by Diomidis Spinellis is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.