Why EMC Acquired Greenplum (Hint: EMC Loves Data)
EMC just bought Greenplum (for an undisclosed sum). It immediately announced that “Greenplum would form the foundation of a new data computing product division within EMC’s Information Infrastructure business.”
So what does that mean?
Most commentators are suggesting that EMC is making an entry into the big BI space. And indeed, with Greenplum, that’s mostly what it gets. But technology-wise, it’s more than that.
High Volume Fruit
I’ve been tracking Greenplum for a couple of years now. Greenplum’s technology is based on an open-source database combined with a massively parallel processing architecture that can exploit cheap commodity servers (see the white paper). The company has proven the capability and built up a list of impressive clients, including NASDAQ OMX, Skype, Equifax and T-Mobile.
There are several reasons why I have tracked the company and the product. The product actually came to market in 2006 as a commercial open-source pure play with the unlikely mission:
To enable companies to manage all of their data, and make it useful.
To me that sounds suspiciously like the “original database vision,” which was to have a single data layer (if not a single database) for the whole of the organization. And that idea hit the dirt before relational databases were born.
But hold on a moment. Maybe after 40 years of Moore’s Law coupled with cheap commodity servers…
Let’s take a quick look at where Greenplum is and what it has achieved.
- There are over 100 global enterprise Greenplum customers.
- The company is currently growing at over 100% and acquiring customers at a faster pace than rivals Netezza and Teradata.
- It is now in version 4.0 (more of which later) and it claims to be “a genuine floor-sweep replacement option for Teradata, Oracle, DB2 and SQL Server.”
If you are interested in using Greenplum, you can try it for nothing. This is a neat sales ploy by Greenplum to provoke adoption. Greenplum prices a single server license at $0.00, choosing to charge real dollars only when the number of servers grows. This also unashamedly positions Greenplum as a high volume product with a low TCO.
The Chorus
There’s much more to say about Greenplum on the BI side, most of it good, but let’s just focus here on the cloud. In April, Greenplum announced Greenplum Chorus, a cloud architecture for deploying Greenplum.
It’s a data cloud, or what you might imagine a data cloud to be. I tend to think of Greenplum Chorus as a data fabric, a data domain that holds all the shared data of an organization, whether data marts or data warehouses or operational data stores or whatever. “One ring to rule them all”, so to speak.
It’s clear that this, rather than a BI database, is what EMC has bought to add to its portfolio, and with EMC’s penetration of the corporate data center, it’s easy to imagine it marketing a “full sweep replacement” for Oracle, DB2, SQL Server et al. Whether it will be able to establish Greenplum in this way is another question.
Of course no company is going to migrate all its databases. It’s just never the most pressing thing to do in any data center. But they may strategically adopt Greenplum. I know that if I were running a data center I’d take a serious look at doing that.
This looks to be a good acquisition by EMC; very synergistic and delivering real potential. The Cisco/VMware/EMC partnership will be stronger for it.
If you’d like to download a white paper on Greenplum, click here.

I am trying to contact Robin Bloor but the [Contact form] link on his many web sites is uniformly absent. Please advise! Thanks!
Sorry about that, we’ll fix it. However I now have your email, so I’ll contact you.