How safe are the clouds?

In the last week, Carbonite, a large backup vendor, was reported to have lost data for a significant number of users. In articles like this and this, journalists and users have been asking, rightly so, “How safe is data in the cloud?” If companies providing Cloud Computing are charging money to manage others’ data, we should expect them to do a better job than the average user or company. But how much safety can we reasonably expect?

As someone who uses Cloud-based services each day, I follow these discussions with interest. How much safety do I expect from using the Cloud? I expect to have full access to every bit of data I have at all times. No having access to some of my data and not the rest, no requesting a restore and waiting for an email when it’s ready, no waiting for a DVD in the mail. In a nutshell, complete and instant access. But is this a reasonable expectation for all data managed through Cloud Computing?

In my view, no. “Cloud Computing” is a type of technology, just as a “car” is a type of vehicle. While a Volvo and a Pinto are both cars, few would argue that they can expect the same level of safety. Before I buy a car, I ask basic questions to find out how many airbags there are or whether there is traction control because I know each car is different.

Cloud Computing is no different. Problems that affect one Cloud Computing solution may not affect another. I can reasonably expect complete and instant access because I did my homework.

Below is a basic set of questions I use to determine how safe my data is. I’ve included answers for Syncplicity as well for those curious.

1. How is the data stored?

You want at least three geo-replicated copies. With three copies, if one fails, you still have multiple copies elsewhere during the time it takes an automated system or a person to fix the problem. That matters because a second failure following the first is surprisingly likely. Geo-replication means the data is stored in multiple data centers. Natural disasters, power outages, and network failures can all disable a data center, and these types of problems, while rare, typically take days to resolve.

What Syncplicity does: For a higher level of data safety, Syncplicity keeps four geo-replicated copies of all data.
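To see why the number of copies matters, here is a back-of-the-envelope sketch. It assumes each copy fails independently during a single repair window, with the same probability. Both the function name and the 1% figure are made up for illustration and are not measurements from any provider.

```python
def prob_all_copies_lost(copies: int, p_fail: float) -> float:
    """Chance that every copy fails within one repair window.

    Assumes failures are independent and equally likely for each copy,
    which is a simplification; p_fail is a hypothetical input.
    """
    if copies < 1:
        raise ValueError("need at least one copy")
    if not 0.0 <= p_fail <= 1.0:
        raise ValueError("p_fail must be a probability")
    return p_fail ** copies


# Illustrative only: 1% is an assumed per-copy failure chance, not real data.
for n in (1, 2, 3, 4):
    print(n, "copies:", prob_all_copies_lost(n, 0.01))
```

Under this simplified model, every extra copy multiplies the risk of total loss by the per-copy failure probability. The independence assumption is the weak point: copies that share a data center can all go down together, which is why the copies should be geo-replicated rather than just duplicated.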

2. Are verifications being done on the data?

Storing data is easy. It’s making sure you can get it back, one month, one year, or one decade later that’s hard. If you’ve ever burned a CD and found you couldn’t read it back later, you’ve experienced bit rot. You want to find out how your provider verifies your data hasn’t suffered bit rot and is still accessible.

What Syncplicity does: Syncplicity works with active data, so our systems are constantly writing and reading the stored data. On each read, a verification test confirms that the data is accessible and identical to what was stored, so you can be assured that your data is available when you need it.
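For readers wondering what a verification on read can look like, here is a minimal sketch. It assumes a SHA-256 digest is recorded when data is written and recomputed on every read. The class name and the in-memory dictionaries are hypothetical stand-ins for a real storage backend, and this is not a description of how Syncplicity or any other provider implements it.

```python
import hashlib


class VerifiedStore:
    """Toy in-memory store that checks a SHA-256 digest on every read."""

    def __init__(self):
        self._blobs = {}    # key -> stored bytes
        self._digests = {}  # key -> hex digest recorded at write time

    def write(self, key: str, data: bytes) -> None:
        # Record the digest alongside the data when it is first stored.
        self._blobs[key] = data
        self._digests[key] = hashlib.sha256(data).hexdigest()

    def read(self, key: str) -> bytes:
        # Recompute the digest and compare it with the one saved at write time.
        data = self._blobs[key]
        if hashlib.sha256(data).hexdigest() != self._digests[key]:
            raise IOError(f"checksum mismatch for {key!r}: possible bit rot")
        return data


store = VerifiedStore()
store.write("tax-return.pdf", b"important bytes")
print(store.read("tax-return.pdf") == b"important bytes")
```

A mismatch only tells you that a copy is bad. Recovering from it still depends on having another good copy to restore from, which is where the replication question comes back in.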

3. How is the data managed?

You want to make sure that your data is encrypted, that there are strong security and access controls, and that there is a strong privacy policy. One thing to look out for: the privacy and data disclosure policies of many companies cover contact information but specifically exclude your actual data. Thus, while they aren’t allowed to give your home address to anyone, they could hand over your tax return. This is just plain wrong, but something to watch for.

What Syncplicity does: Data stored in Syncplicity is transferred and stored fully encrypted with the same technologies used by banks and the military for classified information. Specifically, data is transferred over 128-bit SSL and stored using the Advanced Encryption Standard (AES). Syncplicity’s Privacy Policy is also one of the strongest in the industry. We make it a point to explicitly extend our privacy policy to cover all stored data. Viral Tarpara and I discussed what this all means in more detail in Understanding Privacy in Cloud Computing.

4. What happens if the company disappears tomorrow?

You’ll want to make sure they have a good story as to how to get your data if they go out of business or the service fails in a prolonged manner.

What Syncplicity does: Syncplicity provides data management rather than pure data storage. In the case that Syncplicity disappears tomorrow, you still have full access to your data on each computer, device, or web application that you use. What you lose is the accessibility and management functions such as access to previous versions and revision history, access to your data from your cell phone, and simple sharing.

As luck would have it, my own laptop died a horrible death this week and my faith in the clouds and Syncplicity got put to the test. I’m happy to report that all of the cloud-based services I used, from hosted e-mail to online code repositories to Syncplicity, passed with flying colors.