rPath urges: Avoid Cloud Computing “Image Sprawl”

November 16, 2011 By David
Grazed from Sys Con Media.  Author: Roger Strukhoff.

Brett Adam, CTO and SVP/Engineering of rPath, discussed "the seven deadly sins" of cloud applications at the recent Cloud Expo in Santa Clara. The sins had much to do with a lack of automation and a focus on infrastructure per se rather than on the applications in your cloud…

One of them – "golden images in a cloud environment" – stood out among the rest. Brett was quite adamant during his presentation about the particularly heinous nature of this sin. I didn’t quite grasp the significance of it, so I followed up with him on this point.

He remained just as passionate in answering my query. Here is his response:

"The issues with storing and managing images are many. Here are just a few:

* An ‘image library’ consumes large quantities of disk space, which only grows over time as new versions of images are created. Old images must be kept around for posterity since deployments may require any version, and audit and compliance may need them as well.

* Images are monolithic with regard to upgrades: once a system is deployed from an initial image, updating the image in the library won’t update the systems. Other mechanisms must be used to effect updates to running systems. This increases costs.

* Images can’t answer questions such as ‘where is this software component in use?’. This question often turns up in security and patch management use cases. It increases the cost of managing such topics, which typically involves ‘scanning’ tools and other ‘reverse engineering’ approaches. These tools are all distinct from the way the image was produced.

* Images beget other images: image sprawl occurs when teams want to tweak the image they obtained from another group. Pretty quickly there are ‘golden VMs’ all over the place. The same update problems then occur: the base image one team used gets updated, but the twelve images that were tweaked from it can’t be updated in the same way. Compliance issues and costs escalate.

* Focusing on images as the primary artifact, adding all sorts of ‘workarounds’ for the above issues, distracts from the real problem which hides huge costs: how are the images constructed in the first place? How do changes get managed from dev to ops?

* Images are useful for rapidly provisioning many identical instances of a particular version of a software stack. That should, however, be their sole role. They should be generated on demand from a version-controlled manifest and only kept around as a ‘cache’ item for some short period of time."
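
To make that last point concrete, here is a minimal sketch of what "generate on demand from a manifest, keep only as a cache" might look like. It is not rPath's tooling. The manifest layout (a dict with a "packages" mapping), the names `ImageCache`, `manifest_key` and `systems_using`, and the `build` callback are all hypothetical, chosen only for illustration. The sketch also shows how keeping manifests rather than images could answer the "where is this component in use?" question without scanning anything.

```python
import hashlib
import json
import time


def manifest_key(manifest):
    """Derive a stable cache key from the manifest contents (hypothetical layout)."""
    canonical = json.dumps(manifest, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class ImageCache:
    """Builds images on demand and keeps them only for a short time-to-live."""

    def __init__(self, build, ttl_seconds=3600.0, clock=time.monotonic):
        self._build = build      # assumed callback: manifest -> image handle
        self._ttl = ttl_seconds  # how long a built image may be reused
        self._clock = clock      # injectable so the TTL logic can be tested
        self._entries = {}       # key -> (image, time the image was built)

    def get(self, manifest):
        """Return a cached image if still fresh, otherwise rebuild it."""
        key = manifest_key(manifest)
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[1] < self._ttl:
            return entry[0]
        image = self._build(manifest)
        self._entries[key] = (image, now)
        return image

    def evict_expired(self):
        """Drop images older than the TTL and return how many were removed."""
        now = self._clock()
        expired = [k for k, (_, built_at) in self._entries.items()
                   if now - built_at >= self._ttl]
        for k in expired:
            del self._entries[k]
        return len(expired)


def systems_using(deployments, component):
    """List deployed systems whose manifest includes the given component."""
    return sorted(name for name, manifest in deployments.items()
                  if component in manifest["packages"])
```

In this sketch the manifest, which would live in version control, is the artifact of record. Any image can be rebuilt from it, so nothing is lost when a cached image expires, and a query such as `systems_using(deployments, "openssl")` only needs the manifests of the deployed systems.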

So now I’ve added "image sprawl" to my list of things to know about when it comes to deploying cloud computing. Who else has recognized this problem?