So it was Saturday night and I was doing what every 22-year-old was doing… doing Google searches for compression algorithms. I couldn't remember much about the acronym for this one particular algorithm other than that it was three letters and began with "P", so in a half-asleep, bored daze I was trying different combinations of letters.
After a few searches, at around 23:16 Singapore time, I noticed something peculiar: underneath the heading of every single result, Google was reporting that "this site may be harmful to your computer".
ASIDE: I knew something was up when even links to Wikipedia were being given the same suspicious treatment… I chuckled and assumed this must have been because of the comparatively poor performance of Google Knol highlighted recently!
It wasn't long before all the major wire services and news companies picked up the story. I had no idea the little thing I had witnessed would become an overnight news sensation. CNET ran an initial story (Google taking security a little too seriously?) and a follow-up (Google warns entire Internet is malware), but the BBC summarised the debacle best in their "Human error" hits Google search report:
For a period on Saturday, all search results were flagged as potentially harmful, with users warned that the site "may harm your computer".
Google attributed the fault to human error and said most users were affected for about 40 minutes.
The internet search engine works with stopbadware.org to ascertain which sites install malicious software on people’s computers and merit a warning.
The list of malevolent sites is regularly updated and handed to Google.
When Google updated the list on Saturday, it mistakenly flagged all sites as potentially dangerous.
Marissa Mayer, VP, Search Products & User Experience at Google, posted and later revised an entry on the official Google Blog:
If you did a Google search between 6:30 a.m. PST and 7:25 a.m. PST this morning, you likely saw that the message "This site may harm your computer" accompanied each and every search result. This was clearly an error, and we are very sorry for the inconvenience caused to our users.
What happened? Very simply, human error. […] We maintain a list of [malware] sites through both manual and automated methods. We work with a non-profit called StopBadware.org to come up with criteria for maintaining this list, and to provide simple processes for webmasters to remove their site from the list.
We periodically update that list and released one such update to the site this morning. Unfortunately (and here’s the human error), the URL of "/" was mistakenly checked in as a value to the file and "/" expands to all URLs.
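To make that "/" remark concrete, here is a minimal Python sketch of how a prefix-matched list could misfire in this way. The quote doesn't say how Google's matching actually works, so the `is_flagged` function, the pattern format and the example hostnames below are all assumptions made up for illustration, not Google's implementation.

```python
from urllib.parse import urlsplit


def is_flagged(url, patterns):
    """Return True if the URL matches any pattern by simple prefix.

    Hypothetical rules, assumed for this sketch only:
    - a pattern starting with "/" is a bare path and applies to every host
    - any other pattern is a "host/path" prefix
    """
    parts = urlsplit(url)
    path = parts.path or "/"          # every URL has at least the root path
    host_and_path = parts.netloc + path
    for pattern in patterns:
        if pattern.startswith("/"):
            if path.startswith(pattern):
                return True
        elif host_and_path.startswith(pattern):
            return True
    return False


# A sane list flags only the listed site (hostname is invented).
good_list = ["badsite.example/downloads/"]

# The same list with a stray "/" entry: every path begins with "/",
# so every URL now matches.
bad_list = good_list + ["/"]

for url in ("https://en.wikipedia.org/wiki/PAQ",
            "http://badsite.example/downloads/setup.exe"):
    print(url, is_flagged(url, good_list), is_flagged(url, bad_list))
```

Under these assumed rules, one stray entry is enough to flag Wikipedia along with everything else, which is at least consistent with what I saw on the results page.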
It really gives you an idea of how valuable and critical a site like Google is these days that an error like this can generate so much news coverage in such a short amount of time.
This incident has also deepened the doubt and scepticism I already had about most content filtering and malware warning systems. There has been much publicity about the ethics of warning users about, and blocking, sites with questionable content, but this is an example of the technical side of such a system failing. While this is an extreme case, mistakes of this kind are unavoidable.
It also chills my blood to think about another scenario: if all it took was a malformed string on the server side, what other mistakes have been made in the past that perhaps haven't been reported? I could go on for paragraphs about this, but I think you're smart enough to visualise the implications.
As for the algorithm I was looking for? Turns out it was PAQ. Not Bill Kurtis.