Leave it to Gartner. The Stamford, Conn.-based analyst firm, which reported widespread CRM failures several years ago, has turned its skeptical eye toward data warehousing.
In a recent report, Gartner predicts that 50% of data warehouse projects through 2007 will have limited acceptance or be outright failures because of a lack of attention to data quality issues.
The solution, according to one Gartner analyst, is a data quality firewall.
"This first started as a way to keep bad data out of your data warehouse," said Ted Friedman, a principal analyst at Gartner. "At the edges of your enterprise, you need something to sniff out data quality issues that are coming from your suppliers."
A data quality firewall acts much like a network security firewall: Just as a network firewall lets packets through only on certain ports, a data quality firewall rejects records with certain data quality issues while allowing others to be stored in the warehouse. The firewall sits between the data sources and the data warehouse and works within the extract, transform and load (ETL) process, Friedman said.
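As a rough illustration of that idea, the sketch below shows a filtering step that could sit between extraction and loading. It is not drawn from Gartner or any particular product. The `Rule` class, the `firewall` function, the field names and the split between blocking and non-blocking rules are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable

Record = dict  # hypothetical: one extracted row, keyed by field name


@dataclass
class Rule:
    name: str
    check: Callable[[Record], bool]  # returns True when the record passes
    blocking: bool                   # True: a failure keeps the record out


def firewall(records, rules):
    """Split records into those allowed through and those rejected."""
    accepted, rejected = [], []
    for record in records:
        failed = [rule for rule in rules if not rule.check(record)]
        blocking_failures = [rule.name for rule in failed if rule.blocking]
        if blocking_failures:
            # Keep the reasons so the source system can be told what to fix.
            rejected.append((record, blocking_failures))
        else:
            # Non-blocking issues are tolerated and the record is loaded.
            accepted.append(record)
    return accepted, rejected


# Hypothetical rules and data, for illustration only.
RULES = [
    Rule("has_policy_id", lambda r: bool(r.get("policy_id")), blocking=True),
    Rule("has_phone", lambda r: bool(r.get("phone")), blocking=False),
]

records = [
    {"policy_id": "P-1", "phone": "555-0100"},
    {"policy_id": "", "phone": "555-0101"},   # blocked: no policy ID
    {"policy_id": "P-3", "phone": ""},        # allowed: phone is non-blocking
]

accepted, rejected = firewall(records, RULES)
```

In this sketch, only the accepted records would move on to the load step, while the rejected ones carry the names of the rules they failed.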
The firewall is not just a concept either, Friedman said. One large insurance company has established a firewall and is using it to place the responsibility for cleaning up bad data on the source systems.
"In the early stages, the firewall tends to be a fairly reactive thing," Friedman said. "You don't know what data is like or where it's bad. As you cover hot spots [for poor data quality], then you start to define business rules to watch those particular areas."
For example, a business that receives a lot of records with missing e-mail addresses can establish rules to block those entries. Several data quality tools can be used to build these firewalls, such as QuickAddress Pro from QAS, which validates addresses coming into a database.
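A rule of that kind might look something like the following minimal sketch. The `email` field name and the `has_email` and `block_missing_email` helpers are hypothetical, and the check only tests whether an address is present, not whether it is valid.

```python
def has_email(record):
    """Return True when the record carries a non-blank e-mail value."""
    email = record.get("email")  # "email" is an assumed field name
    return isinstance(email, str) and email.strip() != ""


def block_missing_email(records):
    """Separate records to load from records blocked for a missing e-mail."""
    loaded, blocked = [], []
    for record in records:
        if has_email(record):
            loaded.append(record)
        else:
            blocked.append(record)
    return loaded, blocked


# Hypothetical incoming records, for illustration only.
incoming = [
    {"name": "A. Smith", "email": "a.smith@example.com"},
    {"name": "B. Jones", "email": "   "},   # blank after trimming
    {"name": "C. Brown"},                   # field absent entirely
]

loaded, blocked = block_missing_email(incoming)
```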
Before launching a data quality firewall, companies should first measure their data to find where the mistakes are and where those mistakes are causing the most pain. Additionally, line-of-business people need to be involved, not just IT, Friedman said.
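One simple way to take that kind of measurement is to count how often each field arrives empty. The sketch below assumes records are available as dictionaries. The `missing_rates` and `hot_spots` functions, the field names and the 25% threshold are illustrative choices, not figures from Gartner.

```python
from collections import Counter


def missing_rates(records, fields):
    """Return the share of records in which each field is absent or blank."""
    missing = Counter()
    for record in records:
        for field in fields:
            value = record.get(field)
            if value is None or str(value).strip() == "":
                missing[field] += 1
    total = len(records)
    return {f: (missing[f] / total if total else 0.0) for f in fields}


def hot_spots(records, fields, threshold=0.25):
    """List fields whose missing rate meets the threshold, worst first."""
    rates = missing_rates(records, fields)
    flagged = [f for f in fields if rates[f] >= threshold]
    return sorted(flagged, key=lambda f: rates[f], reverse=True)


# Hypothetical sample of source data, for illustration only.
sample = [
    {"name": "A. Smith", "email": "a.smith@example.com", "phone": "555-0100"},
    {"name": "B. Jones", "email": "", "phone": "555-0101"},
    {"name": "C. Brown", "email": None, "phone": ""},
    {"name": "D. White", "email": "d.white@example.com", "phone": "555-0103"},
]

rates = missing_rates(sample, ["name", "email", "phone"])
worst = hot_spots(sample, ["name", "email", "phone"])
```

Fields that show up in a list like `worst` would be natural candidates for the first firewall rules.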
"If you do it right, data is the responsibility of both business and IT," Friedman said. "IT implements the technological bits of monitoring for poor quality data, and business people will see the results of what gets kicked out and take action."