How Does Google Protect Your Data in the Cloud?

July 22, 2011 By David
Grazed from ReadWriteWeb.  Author:  Dan Rowinski.

Google has one of the largest and most secure clouds in the entire industry. You do not often hear of a successful distributed denial-of-service attack against Google, and Google applications are rarely hacked (unless, of course, the attack reportedly comes from the Chinese government). How does Google keep the data centers that comprise its cloud so safe, and are they the gold standard in data protection?…

Adam Swidler, senior manager for Google Enterprise, laid out how the company keeps its cloud safe at the Cloud Control Conference in Boston this week. The measures Google takes are quite thorough. For instance, no Google clients or federal regulators are allowed inside Google’s data centers. When it comes to tough nuts to crack on the Internet, Google’s cloud is about as tough as it gets.

Feet On The Ground

Not all Internet security is tied up in firewalls, honey pots, SQL barriers and whatnot. In fact, believe it or not, there was a time before the Internet when the mention of security brought to mind images of large men with guns patrolling walls.

Every Google data center has 24/7 guard support. The guards may not be manning watchtowers with Kalashnikovs, but they are present at all times, doing internal and external patrols. There are alarms linked to the guard stations, closed-circuit television, electronic key access and access logs. You would probably feel safer at a Google data center than on the floor of the Senate.

As we have seen before, the biggest threats to data security are people. Bradley Manning is the poster child for this with WikiLeaks, but IT departments have known for years that users are the biggest vulnerability.

So every Google employee who works at a data center goes through an extensive background check, and access is controlled with VP oversight. Google Apps is certified under FISMA (the Federal Information Security Management Act), but, beyond what it takes to obtain and maintain that accreditation, government employees are not allowed in.

Redundancy, Obfuscation and Structure

Google does not keep all of your files in one place but splits them up and stores the pieces in multiple files on several machines. The file names are randomized so that they reveal neither the content type nor the owner, and each server disc contains hundreds of thousands of files. Even if you knew what you were looking for, it would be hard to find.
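To make the idea concrete, here is a minimal Python sketch of splitting a file into pieces and scattering them under random names. It is only an illustration under stated assumptions, not a description of Google's actual storage system: the names `split_and_store`, `reassemble` and `CHUNK_SIZE`, the round-robin placement, and the use of in-memory dictionaries as stand-ins for machines are all hypothetical.

```python
import secrets

CHUNK_SIZE = 4  # bytes; deliberately tiny for the demonstration (assumed value)


def split_and_store(data, machines, chunk_size=CHUNK_SIZE):
    """Split data into chunks and store each one under a random name.

    `machines` is a list of dicts, each standing in for one storage server.
    Returns a manifest: the ordered list of (machine_index, chunk_name)
    pairs needed to put the file back together.
    """
    manifest = []
    for i, offset in enumerate(range(0, len(data), chunk_size)):
        # The random name says nothing about the owner or the content type.
        name = secrets.token_hex(16)
        machine_index = i % len(machines)  # simple round-robin placement
        machines[machine_index][name] = data[offset:offset + chunk_size]
        manifest.append((machine_index, name))
    return manifest


def reassemble(manifest, machines):
    """Rebuild the original bytes by following the manifest in order."""
    return b"".join(machines[idx][name] for idx, name in manifest)


machines = [{}, {}, {}]
manifest = split_and_store(b"hello, cloud storage", machines)
restored = reassemble(manifest, machines)
```

Without the manifest, someone looking at a single machine sees only short, randomly named fragments mixed in with everyone else's, which is the "hard to find" property described above.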

Google also obfuscates its data. This is not a simple encryption, nor is it easily rendered back to clear text. Encryption can be hacked and readable language can be copied. Think of it like this: The data is made confusing and opaque so that it cannot be discerned, except by the server itself; it is like one of those pixelated pictures that you have to unfocus your eyes to see that it is, indeed, a sailboat.

The actual hardware, the servers, is custom-built and runs a Linux software stack. The discs themselves are organized and labeled efficiently. If a disc goes bad, it is warped with a piston-like device called "the crusher" and then put through a disc shredder.

Privacy and Certification

Google invites third-party white hat hackers to try to penetrate the data centers from outside on a quarterly basis. Google products are hard to hack, as white hat hackers have seen at the Pwn2Own contest in recent years, where the Chrome browser could not be hacked. Pwn2Own did not even invite it back (Google entered it anyway with a bounty). Chrome has indeed since been hacked, claims Vupen Security.

The third-party attacks are in addition to probes that come at pretty much every hour of the day, every week. Large organizations like Google, Apple, Amazon, the U.S. federal government, banks and other financial institutions are always under attack.

Google has the aforementioned FISMA certification (which both Windows Azure and Amazon Web Services have as well) along with SAS 70 Type II certification and U.S./E.U. Safe Harbor certification. Basically, if there is an important business or government security certification, Google Apps and data centers have it.

On the privacy front, Google takes pains to assure users that Google does not own the data stored in its cloud. On one hand, this is supposed to reassure clients that Google is not using their data for anything nefarious. On the other hand, Google is protecting itself from whatever harmful data could be stored in its stacks. Officially, Google is known as a "data processor." Deny ownership, deny liability, more or less. Google promises that data can be taken out of the cloud at any time and promises that it will be completely eradicated within 60 days (though usually much sooner).

When data is taken out of the cloud, the disc is not wiped to erase it. That would be a painful process for Google. Essentially the data is "broken" so that it no longer functions and its place on the disc is gradually rewritten (over a time span no longer than 60 days).
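One common way to get this behavior is to delete only the index entry that points at the data and let later writes reuse the space. The Python sketch below illustrates that general technique; it is an assumption about how such "breaking" could work, not Google's documented mechanism, and the `BlockStore` class and its methods are hypothetical names.

```python
class BlockStore:
    """Toy disc: a fixed set of blocks plus an index from names to blocks."""

    def __init__(self, num_blocks):
        self.blocks = [b""] * num_blocks
        self.index = {}                      # name -> list of block numbers
        self.free = list(range(num_blocks))  # blocks available for reuse

    def write(self, name, chunks):
        if len(chunks) > len(self.free):
            raise ValueError("not enough free blocks")
        used = []
        for chunk in chunks:
            block = self.free.pop(0)
            self.blocks[block] = chunk  # overwrites any stale bytes left there
            used.append(block)
        self.index[name] = used

    def read(self, name):
        return b"".join(self.blocks[b] for b in self.index[name])

    def delete(self, name):
        # "Break" the data: drop the index entry so it can no longer be read
        # through the store. The bytes stay on the disc until they are reused.
        self.free.extend(self.index.pop(name))


store = BlockStore(num_blocks=2)
store.write("report", [b"x", b"y"])
store.delete("report")
stale = list(store.blocks)     # old bytes are still physically present
store.write("photo", [b"1", b"2"])
current = list(store.blocks)   # the old bytes have now been overwritten
```

The delete itself is cheap because nothing is scrubbed at that moment; the old bytes disappear only as new data lands on the same blocks, which is why a store like this would quote a maximum time window rather than an instant.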

Most of this information is available at the Google Apps security page, which includes videos, FAQs and white papers on security. If you are a Google client curious about where and how your data is stored, it is a good idea to familiarize yourself with Google’s practices.