Google Compute Engine rocks the cloud
August 22, 2012. Grazed from InfoWorld. Author: Peter Wayner.
You’re sitting around. You have some computing to do. Ten years ago, you would ask your boss to buy a rack or two of computers to churn through the data. Today, you just call up the cloud and rent the systems by the minute. This is the market that Google is now chasing by packaging up time on its racks of machines and calling it the Google Compute Engine.
Google took its sweet time entering this corner of the cloud. While Amazon, Rackspace, and others started off with pay-as-you-go Linux boxes and other "infrastructure" services, Google began with the Google App Engine, a nice stack of Python that held your hand and did much of the work for you. Now Google is heading in a more general direction and renting raw machines too. The standard distro is Ubuntu 12.04, but CentOS instances are also available. And you can store away your own custom image once you configure it…
Google’s big selling point
Why rent machines from Google instead of Amazon or Rackspace or some other IaaS provider? Google claims its raw machines are cheaper. This is a bit hard to determine with any precision because not everyone is selling the same thing despite claims of computing becoming a commodity. Google sells its machines by the Google Compute Engine Unit (GCEU), which it estimates is about a 1GHz to 1.2GHz Opteron from 2007.
All of Google’s machines rent for 5.3 cents per GCEU per hour, but that isn’t really what you pay. The smallest machine you can rent from Google today, the so-called n1-standard-1-d, goes for 14.5 cents per hour. That’s because the n1-standard-1-d — which comes with one virtual core, 3.75GB of RAM, and 420GB of disk space — is equivalent to 2.75 GCEUs, according to Google. You can get machines with two, four, and eight virtual cores all at the same price per GCEU.
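The arithmetic behind that claim is easy to check. Using the figures above (the hourly price and the 2.75 GCEU rating are Google's own numbers for the n1-standard-1-d):

```python
# Back-of-the-envelope check of the effective per-GCEU rate, using the
# article's figures for the n1-standard-1-d instance.
HOURLY_RATE = 0.145   # dollars per hour
GCEUS = 2.75          # Google's GCEU rating for this instance type

rate_per_gceu = HOURLY_RATE / GCEUS
print(f"{rate_per_gceu * 100:.1f} cents per GCEU-hour")  # → 5.3 cents
```

The division comes out to about 5.27 cents, which rounds to Google's advertised 5.3 cents per GCEU per hour.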
These numbers are bound to evolve soon according to a member of the Google Compute Engine team. The product is said to be in "limited preview," and as it grows more polished, the company will probably experiment with adding more options with more or less power.
Is 5.3 cents per GCEU a good deal? It depends upon what you want to do with your machine. Rackspace prices its machines by the amount of RAM you get. It has stopped selling the anemic 256MB RAM VMs, but rents its 512MB boxes at only 2.2 cents per hour or $16.06 per month. If you want a machine with 4GB from Rackspace, it will cost you 24 cents each hour, about $175 per month.
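Those monthly figures follow from the hourly rates, assuming the roughly 730-hour month that hosting price sheets typically use:

```python
# Converting Rackspace's hourly prices (quoted above) to monthly totals.
# The 730-hour month is the usual hosting-industry convention, not a
# number from Rackspace's price sheet.
HOURS_PER_MONTH = 730

for ram, hourly in [("512MB", 0.022), ("4GB", 0.24)]:
    monthly = hourly * HOURS_PER_MONTH
    print(f"{ram}: ${hourly}/hr is about ${monthly:.2f}/month")
```

That puts the 512MB box at $16.06 per month and the 4GB box at just over $175, matching the quoted prices.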
Is that a better deal? If your computation doesn’t need the RAM, a basic instance from Rackspace is much cheaper; even if the CPU isn’t as powerful, you’d be better off with the cheaper machine. But I suspect many will need fatter machines, because modern operating systems suck up RAM like a blue whale sucks up krill.
Google Compute Engine gives you a clean, Google-esque Web dashboard to create instances, assign them to zones, and monitor their status. So far, you can choose from Ubuntu and CentOS images.
After you get past the differences over RAM and disk space, the Google machines are meant to be essentially the same as the machines from Amazon or Rackspace — or even the machines you might buy on your own. Like Amazon and Rackspace, Google makes it easy to start off with Ubuntu; after that, you’re talking to Ubuntu, not Google’s code. There are differences in the startup and shutdown mechanisms, but these aren’t substantial. More substantial is the lack of snapshots for persistent storage, which Amazon offers, but Google promises this is coming soon.
If you’re migrating from Amazon or Rackspace, you’ll need to rewrite your scripts because the APIs are full of linguistic differences, even if they offer most of the same features.
Google Compute Engine ins and outs
Another big part of the equation is bandwidth. Google doesn’t charge for ingress, but it has a fairly complicated model for egress. Shipping data to a machine in the same zone in the same region is free, but shipping it to a different zone in the same region is one penny per gigabyte. Then the cost for letting the data "egress" to the Internet depends upon whether it’s going to the Americas/EMEA or the APAC (Asia and the Pacific). For what it’s worth, egressing the data to some website visitor from the APAC is almost twice as expensive as egressing it to someone in the United States. The costs are set on a sliding scale with discounts for big egressers.
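A small estimator makes the shape of that model clearer. The structure below (free ingress and intra-zone traffic, one cent per gigabyte across zones, region-dependent Internet egress) comes from the article, but the two Internet rates are illustrative placeholders, not Google's published numbers:

```python
# Sketch of how Google's egress model composes into a bill. The Internet
# rates below are placeholder assumptions; only the intra-zone (free) and
# cross-zone (one cent per GB) figures come from the article.
RATES_PER_GB = {
    "same_zone": 0.00,          # free within a zone
    "cross_zone": 0.01,         # one cent per GB between zones in a region
    "internet_americas": 0.12,  # placeholder rate
    "internet_apac": 0.21,      # placeholder: roughly twice the Americas rate
}

def egress_cost(gigabytes, path):
    """Estimate egress cost in dollars for a given traffic path."""
    return gigabytes * RATES_PER_GB[path]

print(egress_cost(100, "same_zone"))   # intra-zone traffic is free: 0.0
print(egress_cost(100, "cross_zone"))  # → 1.0
```

A real estimator would also need the sliding-scale discounts the article mentions for heavy egressers.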
While the complexity of the pricing table will send the purchasing managers to their calculators, it’s interesting what Google is trying to do with this scheme. By making intermachine communications free, Google is no doubt banking on people using the racks in the same zones to actually work together on solving problems. In other words, Google is giving us the tools for stitching together our own supercomputers.
In general, Google is doing a good job of making some of the dangers of the cloud apparent. Like compute instances in Amazon, Rackspace, and other IaaS clouds, each Google instance comes with "ephemeral disk," a name that makes the storage sound more fragile than it really is. Keep in mind that the file system that comes with your cloud computer — be it on Amazon, Rackspace, or Google — is not backed up in any way unless you code some backup routines yourself. You can run MySQL on your cloud box, but the database won’t survive the failure of your machine, so you’d better find a way to keep a copy somewhere else too.
Calling the storage "ephemeral" makes it obvious that the data might go elsewhere during a real failure or even a "maintenance window." If anything, the name might overstate the dangers, but it all becomes a gamble of some form or another. The solution is to purchase separate "persistent disk" space and store your information there. Or you might want to put it in Google Cloud SQL, the BigQuery data store [9], or one of the other services offered by Google.
If words like "ephemeral" still sound off-putting, the documentation says Google will negotiate service-level agreements for enterprise customers that begin with promises of 99.95 percent uptime.
Google is also making the dangers of location apparent. One section of the documentation addresses just how you should design your architecture around potential problems. The various zones and regions may go down from time to time, and it’s your responsibility to plan ahead for these issues. Google makes the costs of shipping the data transparent, so you can come to intelligent decisions about where to locate your servers to get the redundancy you need.
Ready integration with other Google services is one of Compute Engine’s main attractions. It’s just one of 46 services that you can access through Google’s Developer API.
A Compute Engine with a view
Google Compute Engine is just one part of the Google APIs portal, a grand collection of 46 services. These include access to many of Google’s biggest databases such as Books, Maps, and Places, as well as to some of Google’s lesser-known products like the Web Fonts Developer API.
I suspect many developers will be most interested in using Google Compute Engine when they want to poll these Google databases fairly often. While I don’t think you’re guaranteed to be in the same zone as the service you want, you’re still closer than when traveling across the generic Web. Google offers "courtesy" limits to many of these APIs to help out new developers, but you will end up paying for the best services if you use them extensively. These prices are changing frequently as Google and the developers try to figure out what they’re really worth.
Google says some experimenters are already pairing the Compute Engine with the App Engine to handle expensive computations. In one of the experiments, Google worked with a biology lab to analyze DNA. The data was uploaded through an App Engine front end, then handed over to a block of Compute Engine cores to do the work. The Compute Engine machines were started up when the data arrived, and they were shut down and put back in the pool as soon as their work was done.
You can start and stop your machines by hand and track them with the Web portal, but I suspect many will end up using the command-line tool. Google distributes some Python code that handles most of the negotiations for reserving, starting up, and stopping servers. While the Web portal is OK for small jobs, the ability to easily write scripts makes the command-line version more useful.
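A script along these lines might shell out to the command-line tool. The tool and flag names here (gcutil, addinstance, --machine_type) are assumptions based on the limited-preview tooling of the time, so treat this as a sketch rather than a recipe:

```python
# Sketch of scripting the instance life cycle by shelling out to Google's
# command-line tool. Command and flag names are assumed, not confirmed.
import subprocess

def gcutil_command(action, instance, **flags):
    """Build a gcutil command line as a list suitable for subprocess.run."""
    cmd = ["gcutil", action, instance]
    cmd += [f"--{name}={value}" for name, value in sorted(flags.items())]
    return cmd

def start_instance(instance, machine_type="n1-standard-1-d"):
    # check=True raises if the tool reports an error
    subprocess.run(gcutil_command("addinstance", instance,
                                  machine_type=machine_type), check=True)

print(gcutil_command("addinstance", "worker-1",
                     machine_type="n1-standard-1-d"))
```

Wrapping the tool this way is what makes the batch-style workflows described above practical: a script can reserve a block of machines, hand them work, and delete them when the job finishes.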
The command-line tool is also more powerful. You can create instances through the Web GUI, but there’s a limit to how far you can go. I couldn’t figure out how to log in with SSH through the portal, so I switched back to the command line. Perhaps Google should check out some of the HTML5-based tools like FireSSH that integrate SSH with a Web page. The only real challenge is finding a good way to hold the SSH keys.
One of the more interesting features is the way to bind metadata to each computer. Google is clearly intending for people to write their own automatic routines for bringing machines online and off. If you want your software to be self-aware, it can look at the metadata for each instance, and the instance can also read the metadata about itself. This lets you pass in configuration information so that each new machine is not born with a clean slate.
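A startup routine built on this idea might look something like the following. The metadata-server URL and the attribute names are illustrative assumptions; the article only says that an instance can read metadata about itself:

```python
# Sketch of a boot-time routine that configures a fresh instance from its
# own metadata, so new machines are not born with a clean slate. The URL
# and key names are assumptions for illustration.
import json
from urllib.request import urlopen

METADATA_URL = "http://metadata/0.1/meta-data/attributes/"  # assumed endpoint

def parse_config(raw):
    """Turn a metadata payload (JSON text) into a config dict with defaults."""
    config = {"role": "worker", "queue": "default"}  # sane fallbacks
    config.update(json.loads(raw))
    return config

def fetch_config():
    # This call only works from inside a running instance.
    with urlopen(METADATA_URL + "config") as resp:
        return parse_config(resp.read().decode())

print(parse_config('{"role": "reducer"}'))
# → {'role': 'reducer', 'queue': 'default'}
```

The payoff is that the same disk image can boot into different roles, with the controller script stamping each new instance's metadata before it starts.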
If you want to build your own collection of Linux boxes, Google Compute Engine offers a nice, generic way to buy servers at what — depending on the size of compute instance you need — can be a great price. The most attractive feature will probably be the proximity to the other parts of the Google infrastructure. Google is as much a data vendor as an advertising company, and the collection of APIs is growing nicely. I can see how some companies will want to run their computational jobs in the Google cloud just to be closer to these services.