What is a Metadata Warehouse?
The more I think about it, the more I’m convinced that someone should have asked that question a long time ago – and it’s not a question that’s easy to answer. For me, the question emerged from a presentation that Mike Hoskins, the CTO of Pervasive, gave as the keynote to the Pervasive user conference.
I’ll have more to say about Pervasive in another posting, because they are innovating at a furious rate and there is quite a lot more to cover, but I have little doubt that the Metadata Warehouse is the most important innovation that Pervasive is currently presiding over. And I’ll make no attempt to describe Pervasive’s technology or approach to this, because I haven’t taken a deep enough look at it; I’ve only discussed the underlying concept in an all-too-brief conversation with Mike Hoskins.
The Metadata Warehouse
This concept is very powerful for several reasons but most of all because it cuts through a whole series of illusions. Let’s think in terms of corporate data. Is there a “corporate pool of data” that’s available to staff to use as they see fit in order to improve the efficiency of business processes? Well there is, in the sense that corporate computer systems have got data coming out of their ears and all of that data is stored somewhere – OK I’m being optimistic here – some of that data is probably not well looked after and may leak away with time, but most of it is stored.
But all of that data is not accessible in a single coherent way. Now if you’re going to say, “Well what about data warehouses” then my response is:
“Give me a break. The remarkable thing about data warehouses is that companies have so many of them – just like customer databases.”
Or if I’m not in a sarcastic mood, I’ll point out that most data warehouses only hold some corporate data and they hold it “star schema” or “snowflake schema” style – which means that the metadata has been engineered for the very specific context of data mining. Data warehouses are not repositories of “all the company’s data” and nor should they be. And that is why many companies have operational data stores, which are other (possibly large) heaps of data generated for access in a more real-time manner.
The idea that any organization has a single coherent data resource is an as-yet-unrealized dream. But just because it doesn’t exist doesn’t mean that it shouldn’t exist. To be honest, such a thing would be damn useful. And that, of course, is the point:
The only way to achieve a single coherent data resource is to build a Metadata Warehouse and to deploy well-engineered “data middleware” to assemble the right data of the right quality in the right place at the right time.
That, by the way, is what Pervasive is about, as a company. That’s what they do and they’ve been doing it effectively for years. But, if I understand Mike Hoskins, there’s something new stirring in the company. They are staking a claim for intellectual leadership of their sector of IT.
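To make the idea a little more concrete, here is a toy sketch in Python of how a Metadata Warehouse and a piece of “data middleware” might fit together. It is emphatically not Pervasive’s design, which I haven’t examined; every name in it (DatasetEntry, MetadataWarehouse, assemble, the "crm" and "billing" stores) is invented for illustration. The only point is that the catalog holds descriptions of data rather than the data itself, and the middleware uses those descriptions to pull one record together from several stores.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetEntry:
    """Metadata about one physical data store (hypothetical fields)."""
    name: str                 # logical name of the store
    key: str                  # column that identifies a record
    columns: tuple[str, ...]  # columns the store carries


class MetadataWarehouse:
    """Holds descriptions of data, not the data itself."""

    def __init__(self) -> None:
        self._entries: dict[str, DatasetEntry] = {}

    def register(self, entry: DatasetEntry) -> None:
        self._entries[entry.name] = entry

    def get(self, name: str) -> DatasetEntry:
        return self._entries[name]

    def sources_for(self, column: str) -> list[str]:
        """Names of every registered store that carries the column."""
        return sorted(
            e.name for e in self._entries.values() if column in e.columns
        )


def assemble(catalog, stores, key_value, columns):
    """Toy 'data middleware': fetch the requested columns for one key,
    using the catalog to decide which store to read each column from."""
    result = {}
    for column in columns:
        for name in catalog.sources_for(column):
            entry = catalog.get(name)
            # Take the first row in this store whose key matches.
            row = next(
                (r for r in stores[name] if r.get(entry.key) == key_value),
                None,
            )
            if row is not None and column in row:
                result[column] = row[column]
                break  # first store that has the value wins
    return result


# Two separate stores, standing in for the scattered corporate data.
catalog = MetadataWarehouse()
catalog.register(DatasetEntry("crm", "customer_id", ("customer_id", "email")))
catalog.register(DatasetEntry("billing", "customer_id", ("customer_id", "balance")))

stores = {
    "crm": [{"customer_id": 1, "email": "a@example.com"}],
    "billing": [{"customer_id": 1, "balance": 42.0}],
}

record = assemble(catalog, stores, 1, ["email", "balance"])
print(record)
```

A real system would also have to deal with data quality, conflicting values and timing, which is exactly the “right quality, right place, right time” part that this sketch leaves out.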