
DM: Data Quality Metrics


From: Rebecca Buchheit
Date: Fri, 9 Jun 2000 16:35:02 -0400

Hello --

I am trying to learn more about data quality metrics and how to "rate" the
quality of a given data set.  Specifically, I would be using this knowledge
in a case study to rank and describe the quality of several different data
sets.  What I would like to do is come up with a "formula": given what the
client wants to do with his/her data, we decide which metrics are most
important and rate the data set using these metrics.  This would give us a
ranking of the overall "quality" of the data, which in turn could suggest
where we should put most of our effort into data cleaning or whether the
data set is suitable for data mining at all.
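
One possible shape for such a "formula" is a weighted average: each metric
gets a weight reflecting how much it matters for the client's purpose, and
each data set gets a score per metric.  The Python sketch below is only an
illustration under assumptions.  The metric names (completeness, accuracy,
timeliness), the weights, the 0-to-1 scoring scale, and the function names
are all hypothetical and are not taken from any published framework.

```python
def quality_score(metric_scores, weights):
    """Weighted average of per-metric scores.

    Assumes every score is already normalized to the range [0, 1]
    and that weights are non-negative (both are illustrative choices).
    """
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted_sum = sum(weights[m] * metric_scores[m] for m in weights)
    return weighted_sum / total_weight


def rank_data_sets(data_sets, weights):
    """Return (name, score) pairs sorted from highest to lowest score."""
    scored = [(name, quality_score(scores, weights))
              for name, scores in data_sets.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical weights for a client who cares most about completeness.
example_weights = {"completeness": 0.5, "accuracy": 0.3, "timeliness": 0.2}

# Hypothetical per-metric scores for two made-up data sets.
example_data_sets = {
    "A": {"completeness": 1.0, "accuracy": 0.5, "timeliness": 0.5},
    "B": {"completeness": 0.6, "accuracy": 0.9, "timeliness": 1.0},
}

if __name__ == "__main__":
    for name, score in rank_data_sets(example_data_sets, example_weights):
        print(f"{name}: {score:.2f}")
```

A low score on a heavily weighted metric would then point to where cleaning
effort might go first.  How to measure each metric on a real data set is the
open question and is not addressed by this sketch.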

I have read Wand & Wang's "Anchoring Data Quality Dimensions in Ontological
Foundations" (Communications of the ACM [1996], 39:11).  They provide a nice
framework for defining dimensions of data quality, but do not present any
specific applications of their framework.  Does anyone have any suggestions
for further reading?  I am interested in specific applications of data
quality metrics to real/simulated data sets, as well as other competing
frameworks.

Thank you very much for your time.  If the responses warrant, I will
summarize them and post them to the list for everyone to see.

Rebecca Buchheit
rb6g@andrew.cmu.edu
Doctoral Candidate : Computer-Aided Engineering and Management
Civil Engineering  : Carnegie Mellon University
http://www.contrib.andrew.cmu.edu/~rb6g




