AOL just released the logs of all searches done by 500,000 of its users over three months earlier this year. That means that if you happened to be randomly chosen as one of these users, everything you searched for from March to May 2006 is now public information on the Internet. The released data includes 20 million web queries. So if you are one of the lucky winners, start crying!
UPDATE: AOL has realized that they messed up, so they are taking it down, but by now there should be mirrors all over the net. (HINT: Search the Digg or Reddit comments, or try P2P networks. The MD5 of the file is 31cd27ce12c3a3f2df62a38050ce4c0a.)
UPDATE 2: Too late for regrets, AOL folks. Here you can find some mirrors to download the log files.
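If you grab the archive from a mirror, it is worth checking it against the MD5 quoted above before trusting it. Here is a minimal sketch of that check in Python. The function names and the local filename in the usage note are my own assumptions, not anything AOL or the mirrors provide.

```python
import hashlib

# MD5 quoted in this post for the released archive.
EXPECTED_MD5 = "31cd27ce12c3a3f2df62a38050ce4c0a"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in 1 MiB pieces so a large archive is never held in memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_MD5):
    """True when the file's MD5 equals the expected hex digest."""
    return md5_of_file(path) == expected.lower()
```

Calling `matches_expected("AOL-data.tgz")` (assuming that is what you named the download) should return `True` for an unmodified copy. A mismatch means the mirror is serving a different or corrupted file.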
- It’s a huge violation of their users’ privacy, and it could also be a violation of their Privacy Policy (I’ve managed to find telephone numbers, social security numbers and other private queries that users looked up on AOL).
- As TechCrunch points out, people often search for their own family names, and if you combine that with a query such as “buy marijuana” you have enough to start wondering about people’s lives, or worse.
- The data is supposedly anonymized, which in AOL-speak means the screen-name is replaced by a unique user number. Anyone a little bit familiar with data mining knows what this means, and obviously some commenters on the AOL blog have already put two and two together, “outing” certain users whose identity was easy to find based on the search patterns. - Zoli’s blog.
- It won’t be long before people start spamming with queries found in these log files.
- After all the fuss and the debate about the DOJ’s demand for “anonymized” search data last year, which caused all sorts of pain for Microsoft and Google, it’s kind of ironic that AOL does exactly what the DOJ wanted with no hassle.
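To see why swapping the screen name for a unique user number does so little, here is a rough sketch in Python. The records are invented for illustration and are not taken from the released logs, and the `(user_number, query)` layout is an assumption on my part rather than the actual file format.

```python
from collections import defaultdict

# Invented records for illustration only: (user_number, query).
# These are NOT taken from the released logs.
SAMPLE_LOG = [
    (1001, "weather tomorrow"),
    (1002, "john example smith"),
    (1001, "cheap flights"),
    (1002, "buy marijuana"),
    (1002, "plumbers in springfield"),
]


def queries_by_user(records):
    """Group queries under the stable number that replaced the screen name."""
    profiles = defaultdict(list)
    for user_number, query in records:
        profiles[user_number].append(query)
    return dict(profiles)


profiles = queries_by_user(SAMPLE_LOG)
for user_number, queries in sorted(profiles.items()):
    print(user_number, queries)
```

Because the same number follows a user through every query, one grouping pass rebuilds a full search history per person. A family name and a town in the same history is often all it takes to guess who is behind the number.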

1 response so far ↓
1 Niels // Aug 7, 2006 at 10:45 am
The archive is now mirrored at http://aol.br3f.net/AOL-data.tgz