A Need for Big Data Speed

June 8, 2011 | By David
Grazed from IT Business Edge.  Author: Michael Vizard.

One of the less talked about implications of Big Data is the simple fact that processing data volumes that are orders of magnitude larger is going to put a lot of strain on both middleware and the underlying networks that much of that middleware relies on.

This is obviously an issue that will need to be addressed, because suddenly how your middleware providers actually process data is going to matter, especially if that data needs to move across a wide area network. As is often the case with cloud computing, the cost of transferring all that Big Data could have major implications for the networking budget.

In fact, the folks at Informatica are betting that the combination of Big Data and cloud computing is going to create a lot of interest in version 9.1 of the company’s namesake platform. Announced this week, the new release has been optimized to partition Big Data more efficiently, while also making greater use of in-memory caching to boost performance. According to Informatica CTO James Markarian, this is part of Informatica’s “No Data Left Behind” commitment to data integration performance.
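To make the partitioning idea concrete, here is a minimal sketch of what hash-partitioning a stream of records looks like, with a small reference table held in memory so each partition can enrich rows without repeated trips back to the source system. It is purely illustrative, with hypothetical record fields and partition counts, and does not reflect how Informatica 9.1 actually implements either feature.

```python
import hashlib
from collections import defaultdict

NUM_PARTITIONS = 4  # hypothetical number of parallel workers


def partition_key(record_id: str) -> int:
    """Hash the record key so rows are spread evenly across partitions."""
    digest = hashlib.md5(record_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % NUM_PARTITIONS


# Small lookup table cached in memory; avoids a per-row call to the source.
country_lookup = {"US": "United States", "DE": "Germany", "JP": "Japan"}


def process(records):
    """Enrich each record from the cache, then route it to a partition."""
    partitions = defaultdict(list)
    for rec in records:
        rec["country_name"] = country_lookup.get(rec["country"], "Unknown")
        partitions[partition_key(rec["id"])].append(rec)
    return partitions


if __name__ == "__main__":
    sample = [
        {"id": "1001", "country": "US"},
        {"id": "1002", "country": "DE"},
        {"id": "1003", "country": "JP"},
    ]
    for pid, rows in sorted(process(sample).items()):
        print(pid, rows)
```

The point of the sketch is simply that partitioning and caching decisions, done well, keep each worker busy on its own slice of the data rather than waiting on the network.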

From an IT perspective, Markarian also noted that one of the primary benefits of Informatica 9.1 is that there is no need to learn how to use the MapReduce interface in order to work with sources of Big Data such as Hadoop. In fact, one of the primary issues with Big Data performance can be traced back to the low-level nature of the MapReduce interface.
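For readers unfamiliar with why MapReduce is considered low-level, the sketch below shows the kind of hand-written mapper and reducer a developer would otherwise have to supply, written here in a simplified Hadoop Streaming style in Python. Even a basic per-key aggregation means writing, wiring up, and tuning both stages yourself; this is a generic illustration, not Informatica's code or approach.

```python
#!/usr/bin/env python
"""Hand-rolled MapReduce-style job: the mapper emits key/value pairs
and the reducer aggregates them, mimicking Hadoop's map and reduce phases.
Illustrative only; input format and field names are assumptions."""


def mapper(lines):
    # Each input line is assumed to be "customer_id,amount".
    for line in lines:
        customer_id, amount = line.strip().split(",")
        yield customer_id, float(amount)


def reducer(pairs):
    # In Hadoop, pairs arrive grouped and sorted by key between phases;
    # here we just accumulate totals per key.
    totals = {}
    for key, value in pairs:
        totals[key] = totals.get(key, 0.0) + value
    return totals


if __name__ == "__main__":
    sample = ["c1,10.0", "c2,5.5", "c1,2.5"]
    print(reducer(mapper(sample)))  # {'c1': 12.5, 'c2': 5.5}
```

A data integration tool that generates this kind of plumbing for you is the convenience Markarian is pointing at: the aggregation logic is trivial, but expressing it at the MapReduce level is not where most IT teams want to spend their time.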

It’s pretty clear at this point that the amount of data that IT organizations will soon routinely be working with is going to rise exponentially. But that massive explosion of data is not going to be seen as a valid excuse for compromising performance, which before long will have IT organizations rethinking their entire IT strategies from top to bottom.