It has been suggested that if one was somehow able to change history so that aspirin had never been discovered until now, it would have died in the lab and stand no chance of FDA approval. In a report from the Manhattan Institute, they write that no modern drug development organization would touch it. Similarly, if we knew the power that Google would have in 2008 with its ability to aggregate and correlate personal data, it is arguable that various regulatory and privacy bodies would never allow it to exist given the extensive privacy issues.
In a fascinating and eye-opening new book Googling Security: How Much Does Google Know About You?, author Greg Conti explores the many security risks around Google and other search engines. Part of the problem is that in the rush to get content onto the web, organizations often give short shrift to the security and privacy of their data. At the individual level, those who make use of the innumerable and ever expanding amount of Google free services can end up paying for those services with their personal information being compromised, or shared in ways they would not truly approve of; but implicitly do so via their acceptance of the Google Terms of Service.
While the book focuses specifically on Google, the security issues detailed are just as relevant to Yahoo, MSN, AOL, Ask and the more than 50 other search engines.
Until now, Google and security have often not been used together. As an example, my friend and SEO guru Shimon Sandler has a blog around search engine optimization (SEO). In the over three years that his blog has been around, my recent post on The Need for Security in SEO was the first on topic of SEO security. Similar SEO blogs also have a very low number (and often no) articles on SEO and security. Sandler notes that when he mentions privacy issues around search to his clients, it is often the first time they have thought of it.
The book opens with the observation that Google's business model is built on the prospect of providing its services for free. From the individual user's perspective, this is a model that they can live with. But the inherent risk is that the services really are not completely free; they come at the cost of the loss of control of one's personal information that they share with Google.
The book lists over 50 Google services and applications which collect personal information. From mail, alerts, blogging, news, desktop, images, maps, groups, video and more. People are placing a great deal of trust into Google as each time they use a Google service, they are trusting the organization to safeguard their personal information. In chapter 5, the book lists over 20 stated uses and advantages of Google Groups, and the possible information disclosure risks of each.
In the books 10 chapters, the author provides a systematic overview of how Google gets your personal data and what it does with it. In chapter 3, the book details how disparate pieces of data can be aggregated and mined to create extremely detailed user profiles. These profiles are invaluable to advertisers who will pay Google dearly for such meticulous user data. This level of personal data aggregation was impossible to obtain just a few years ago, given the lack of computing power, combined with the single point of user data. The book notes that this level of personalization, while golden to advertisers, is a privacy anathema.
Chapter 6 is particularly interesting in that it details the risks of using Google Maps. Conti explains that the privacy issue via the use of Google Maps is that it combines disclosure risks of search and connects it to mapping. You are now sharing geographic locations and the associated interactions. By clicking on a link in a Google map, the user discloses and strengthens the link between the search they performed and what they deemed as important in the result. By aggregating source IP addresses and destinations searches, Google can easily ascertain confidential data.
After detailing over 250 pages of the risks of Google and related services, Chapter 9 is about countermeasures. Short of simply not using the services, the book notes that there is no clear solution for protecting yourself and company from web-based information disclosure. Nonetheless, the chapter lists a number of things that can be done to reduce the threat. Some are easier, some are harder; but they can ultimately add up to a significant layer of protection. Chapter 9 details 11 specific steps that help users appreciate the magnitude of their disclosures and make informed decisions about which search services to use.
Googling Security: How Much Does Google Know About You? is an important book given that far too many people do not realize how much personal information they are disclosing on a daily basis. An important point that the book makes is that small information disclosures are not truly small when they are aggregated over the course of years. Advances in data mining and artificial intelligence are magnifying the importance of the threat, all under the guise of improving the end-user experience. The book emphasizes the need to evaluate the short-term computing gains with the long-term privacy losses.
The final chapter notes that apathy is the enemy. As a user becomes aware of the magnitude of the threat, they will see it grow every day. But the next step is to take action. Be it with technical countermeasures, taking your business where privacy is better supported, or petitioning lawmakers.
As to the underlying question, "how much does Google know about you?", the answer is that it is a colossal amount, far more than most people realize. For anyone who uses the Internet, Googling Security should be on their list of required reading. The risks that Google and other search engines present are of great consequence and can't be overlooked. If not, privacy could slowly be a thing of the past.