Amazon Web Services launches managed database service

January 18, 2012 By David
Grazed from InformationWeek.  Author: Editorial Staff.

Amazon today announced DynamoDB, a fully managed cloud-based NoSQL database service that builds on the company’s SimpleDB service by delivering faster, more consistent database performance to keep pace with the demands of ever-scaling cloud apps.

The secret sauce here is Amazon’s homegrown Dynamo non-relational database architecture, which the company built to suit the demands of its complex, service-oriented e-commerce architecture. Designed to be a highly reliable, ultra-scalable key/value database, Dynamo has inspired such offerings as Red Hat’s Infinispan data grid technology [1] and Apache Cassandra [2].

But Dynamo, despite being more robust than SimpleDB, hasn’t enjoyed broader adoption because "it did nothing to reduce the operational complexity of running large database systems," according to Amazon CTO Werner Vogels [3].

Indeed, SimpleDB’s strength is its simplicity, as its moniker implies: It provides a straightforward table interface and a flexible data model while eliminating headaches associated with configuration, patching, replication, or scaling.

With DynamoDB, Amazon has attempted to bring together the best of both worlds: Dynamo’s superior scalability, performance, and consistency delivered as an easy-as-pie service, effectively eliminating the complexity of forecasting and planning database deployments. Adding capacity takes a few clicks via the management console.

DynamoDB addresses four of SimpleDB’s more significant shortcomings, according to Vogels:

  • With SimpleDB, users need to add dataset containers, called domains, in increments of 10GB.
  • SimpleDB indexes all attributes to each item stored in a domain, which means that every database write results in an update of not just the basic record, but all attribute indices. This can result in performance hiccups due to latency, especially as a dataset increases in size.
  • Like many NoSQL databases, SimpleDB takes an "eventually consistent" approach: a read may not reflect a recent write until the consistency window, which can last up to a second, has passed.
  • SimpleDB’s pricing, based on "machine hours," has proven complex.

DynamoDB differs from SimpleDB in terms of scalability in that there are no predefined limits to the amount of data a given table can store. Developers can store and retrieve any amount of data, and DynamoDB will spread that data across hundreds or thousands of servers over multiple Availability Zones to meet a user’s storage and throughput requirements.
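The article does not say how DynamoDB decides which server holds which item. As a rough illustration of the general idea of spreading a table across many machines, the sketch below maps each primary key to one of a fixed number of partitions with a stable hash. The function name, the key format, and the partition count are all hypothetical, and simple hash-modulo placement is an assumption made for illustration, not a description of Amazon's placement algorithm.

```python
import hashlib
from collections import Counter


def partition_for(key: str, num_partitions: int) -> int:
    """Map a primary key to a partition index using a stable hash."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[:8], "big") % num_partitions


# Hypothetical keys and partition count, for illustration only.
keys = [f"user-{i}" for i in range(10_000)]
load = Counter(partition_for(k, 8) for k in keys)

# Print how many keys landed on each partition.
for partition, count in sorted(load.items()):
    print(partition, count)
```

Because the hash is stable, the same key always lands on the same partition, and a large set of keys ends up spread across all of them, so storage and request load are shared rather than concentrated on one machine.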

DynamoDB uses a couple of techniques to deliver better, more consistent performance, according to Amazon. It has high throughput at a very low latency in part because the service is built on solid state drives, which, according to Vogels, "helps to optimize high performance even at high scale."

Moreover, DynamoDB’s write operations entail updating only primary key indices. That reduces the latency of both read and write operations.
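To make the indexing difference concrete, here is a deliberately simplified cost model that counts how many index entries a single write touches under each approach. The function, the sample item, and the rule that a multi-value attribute costs one entry per value are assumptions for illustration; neither service's internals are described in this article.

```python
def index_updates_per_write(item: dict, primary_key: str,
                            index_all_attributes: bool) -> int:
    """Count index entries touched when one item is written."""
    if index_all_attributes:
        # Every attribute is indexed; a multi-value attribute
        # is assumed to cost one index entry per value.
        return sum(
            len(value) if isinstance(value, (set, list, tuple)) else 1
            for value in item.values()
        )
    # Only the primary key index is maintained.
    return 1 if primary_key in item else 0


# Hypothetical item with five attributes, one of them multi-valued.
item = {
    "id": "order-1001",
    "customer": "alice",
    "status": "shipped",
    "total": 42,
    "tags": {"gift", "express"},
}

all_attrs = index_updates_per_write(item, "id", index_all_attributes=True)
pk_only = index_updates_per_write(item, "id", index_all_attributes=False)
print(all_attrs, pk_only)  # 6 1
```

In this model the index-everything approach does six index updates for one write while the primary-key-only approach does one, and the gap widens as items gain attributes.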

"Most importantly, DynamoDB latencies are predictable," Vogels explained "Even as datasets grow, latencies remain stable due to the distributed nature of DynamoDB’s data placement and request routing algorithms."

Amazon DynamoDB also aims to give developers a high degree of flexibility in that it does not force them to use a particular data model or consistency model. Tables do not require a fixed schema; rather, each data item can have any number of attributes — even multi-value attributes. Developers can opt for stronger consistency models when accessing the database, and they can take advantage of the atomic increment/decrement functionality of DynamoDB for counters.
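A toy in-memory model can show what a schema-free table with multi-value attributes and an atomic counter looks like from a developer's point of view. This is not the DynamoDB API: the class `ToyTable` and its `put`, `get`, and `add` methods are hypothetical names, and a local lock stands in for whatever the service does on the server side.

```python
import threading
from typing import Any, Dict


class ToyTable:
    """In-memory stand-in for a schema-less table keyed by a primary key."""

    def __init__(self) -> None:
        self._items: Dict[str, Dict[str, Any]] = {}
        self._lock = threading.Lock()

    def put(self, key: str, **attributes: Any) -> None:
        """Store an item; any set of attributes is accepted."""
        with self._lock:
            self._items[key] = dict(attributes)

    def get(self, key: str) -> Dict[str, Any]:
        """Return a copy of the item, or an empty dict if it is absent."""
        with self._lock:
            return dict(self._items.get(key, {}))

    def add(self, key: str, attribute: str, delta: int) -> int:
        """Atomically add delta to a numeric attribute (starting from 0)."""
        with self._lock:
            item = self._items.setdefault(key, {})
            item[attribute] = item.get(attribute, 0) + delta
            return item[attribute]


table = ToyTable()
# Two items in the same table with different attribute sets.
table.put("song-1", title="Example", genres={"rock", "indie"})
table.put("song-2", title="Other", year=2012, plays=0)


def play_many(times: int) -> None:
    for _ in range(times):
        table.add("song-2", "plays", 1)


# Four concurrent writers incrementing the same counter.
threads = [threading.Thread(target=play_many, args=(1000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(table.get("song-2")["plays"])  # 4000
```

Because each increment happens as a single locked step rather than a separate read followed by a write, no updates are lost even with several writers, which is the property that makes atomic counters useful.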

DynamoDB integrates with the Amazon Elastic MapReduce service, through which users can perform complex analytics of their large datasets using a hosted Hadoop framework on Amazon Web Services. Organizations can use MapReduce to analyze datasets stored in DynamoDB and archive the results in Amazon Simple Storage Service.

Price-wise, Amazon DynamoDB storage runs $1 per GB per month. Requests are priced based on how much capacity is reserved: $0.01 per hour for every 10 units of Write Capacity and $0.01 per hour for every 50 units of Read Capacity.
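For a sense of what those launch prices add up to, the following sketch estimates a monthly bill from the figures quoted above. Two things are assumptions rather than facts from the article: a 720-hour (30-day) month, and rounding partial capacity blocks up to the next full block. The function name and the sample workload are hypothetical.

```python
import math

# Launch prices quoted in the article, expressed in cents to avoid
# floating-point drift.
STORAGE_CENTS_PER_GB_MONTH = 100  # $1 per GB per month
CENTS_PER_BLOCK_HOUR = 1          # $0.01 per hour per capacity block
WRITE_BLOCK_UNITS = 10            # one block = 10 units of Write Capacity
READ_BLOCK_UNITS = 50             # one block = 50 units of Read Capacity


def monthly_cost_usd(storage_gb: int, write_units: int, read_units: int,
                     hours: int = 720) -> float:
    """Estimate a monthly bill in dollars (assumes a 720-hour month)."""
    # Assumption: partial blocks are billed as full blocks.
    write_blocks = math.ceil(write_units / WRITE_BLOCK_UNITS)
    read_blocks = math.ceil(read_units / READ_BLOCK_UNITS)
    throughput_cents = (write_blocks + read_blocks) * CENTS_PER_BLOCK_HOUR * hours
    storage_cents = storage_gb * STORAGE_CENTS_PER_GB_MONTH
    return (throughput_cents + storage_cents) / 100


# Hypothetical workload: 10 GB stored, 100 write units, 500 read units.
print(monthly_cost_usd(storage_gb=10, write_units=100, read_units=500))  # 154.0
```

In this example the reserved throughput ($144) dwarfs the storage charge ($10), which suggests that under this pricing model the capacity a table reserves, not the data it holds, is likely to drive the bill for small, busy tables.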