The Humpty Dumpty data dilemma

Corporate data is increasingly falling apart and execs are constantly trying to piece it back together again. But companies have plenty of choices when it comes to getting scattered data in order.

If you've got information scattered in analytic silos throughout your enterprise, you're not alone.

Plenty of organizations have multiple analytic silos, according to a report from The Data Warehousing Institute (TDWI).

The Seattle-based researcher talked to 650 IT professionals at companies of all sizes, though more than one-third work at organizations with more than $1 billion in annual revenue.

Only 11% of those surveyed have completed a project to consolidate silos. However, 56% are designing or implementing a consolidation project and another 26% are exploring the idea.

Of course, actually completing a consolidation project is another story.

"In reality, very few companies actually finish these projects. The business is always changing," said Wayne Eckerson, TDWI's director of research and author of the report. "In this area, it truly is an ongoing process. In many ways data warehousing was established to help companies reintegrate themselves because companies are continuously disintegrating themselves."

Eckerson called it the Humpty Dumpty model. Just like in the nursery rhyme, corporate analytic data is increasingly falling apart and execs are constantly trying to piece it back together again.

On average, organizations have to consolidate two data warehouses, six independent data marts, four and a half operational data stores and 28 and a half spreadmarts (defined as spreadsheets or desktop databases), the report found.

The main challenge in consolidating data into a central repository is less about technology than about process, Eckerson said. Executives get apoplectic when managers spend more time in meetings arguing about whose data is right than developing strategies and plans to achieve corporate goals, according to the report.

"The question is what is the scope of their agreement?" Eckerson said. "It really is the hard work of rolling up your sleeves and getting representatives of major constituencies together."

The challenges and benefits become more pronounced for CRM users, Eckerson noted. CRM systems provide value only with a unified data structure.

"You can't have that until people agree what a customer is," Eckerson said. "It's not easy. Is it a supplier? Someone who paid you before? A prospect? There are a lot of things to hash out."

Companies can use several strategies for consolidating analytic silos, which Eckerson lays out in the report. The most common is starting from scratch by simply building a new warehouse instead of designating or merging existing warehouses.

"When you bring a builder in to work on your house a lot of times they say 'let's tear this down and start from scratch because I don't want to tear down these walls and find problems I haven't budgeted for,'" Eckerson said.

The same principle applies to building a consolidated warehouse.

Another option, which Eckerson terms the "designate and evolve" approach, is commonly used with acquisitions. Organizations will designate an existing warehouse as the standard and migrate information from the newly acquired system into it.

The "backfill" approach requires implementing a staging area behind existing data marts. When corporate politics make starting over impractical, a backfill system sitting behind the existing data marts becomes a single source of information.

Some organizations are using a virtual data warehouse, Eckerson said. Under this scenario, a tool queries multiple sources and pulls the results together in real time as if they came from one place. While this can be done quickly and cheaply, Eckerson warned of several limitations. Virtual data warehouses work best when data volume is small and relatively clean. Often these virtual warehouses are used to determine what sort of information users really want, and that insight is then used to create a better warehouse or data mart.
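To make the virtual approach concrete, here is a minimal sketch in Python. It is an illustration only, not a tool described in the report: the two in-memory SQLite databases stand in for independent data marts, and the table layout, customer names and function names are all hypothetical.

```python
import sqlite3


def make_source(rows):
    """Build an in-memory SQLite database standing in for one independent data mart."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: one sales table per mart.
    conn.execute("CREATE TABLE sales (customer TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    conn.commit()
    return conn


def virtual_sales_by_customer(sources):
    """Query every source at request time and merge the results in memory.

    Nothing is copied into a central store; each call goes back to the sources.
    """
    totals = {}
    for conn in sources:
        query = "SELECT customer, SUM(amount) FROM sales GROUP BY customer"
        for customer, amount in conn.execute(query):
            totals[customer] = totals.get(customer, 0.0) + amount
    return totals


# Two made-up marts that both hold rows for the same customer.
east = make_source([("Acme", 100.0), ("Globex", 50.0)])
west = make_source([("Acme", 25.0), ("Initech", 75.0)])

combined = virtual_sales_by_customer([east, west])
print(combined)
```

The merge only works here because both marts spell the customer name the same way, which is a small-scale version of the clean-data limitation Eckerson describes.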

Other strategies include a super data mart of data marts, which provides a view of distributed data marts. Another is "rehosting," or moving existing analytic structures entirely into a single platform. Rehosting allows organizations to reduce servers and staff, but doesn't change the application in any way.

Eckerson was pleasantly surprised to see some of the ROI that came from these projects.

The average ROI for consolidation projects is $3.34 million with an average payback period of 2.1 years, according to the report. ROI generally comes from the savings associated with shutting down independent data marts minus the migration costs. TDWI found it costs $614,000 a year to maintain an independent data mart and nearly $1.6 million to maintain a data warehouse.
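The savings-minus-migration logic can be sketched as simple arithmetic. The Python below is a back-of-the-envelope illustration, not TDWI's methodology: the $614,000 annual cost per data mart comes from the report, while the migration cost, the number of marts retired and the function names are hypothetical.

```python
# Annual cost of maintaining one independent data mart, as cited from the report.
MART_ANNUAL_COST = 614_000


def annual_savings(marts_retired):
    """Yearly maintenance cost avoided by shutting down independent data marts."""
    return marts_retired * MART_ANNUAL_COST


def net_savings(marts_retired, migration_cost, years):
    """Cumulative savings over a period, minus the one-time migration cost."""
    return annual_savings(marts_retired) * years - migration_cost


def payback_years(marts_retired, migration_cost):
    """Years until avoided maintenance costs cover the migration cost."""
    return migration_cost / annual_savings(marts_retired)


# Hypothetical scenario: retire six marts with an assumed $5 million migration.
marts = 6
migration = 5_000_000

print(f"Annual savings: ${annual_savings(marts):,}")
print(f"Net savings after 3 years: ${net_savings(marts, migration, 3):,}")
print(f"Payback period: {payback_years(marts, migration):.2f} years")
```

A real estimate would also need to account for the running cost of the consolidated warehouse, which this sketch leaves out.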
