DM: Data Quality Metrics
From: Rebecca Buchheit
Date: Fri, 9 Jun 2000 16:35:02 -0400

Hello --

I am trying to learn more about data quality metrics and how to "rate" the quality of a given data set. Specifically, I would use this knowledge in a case study to rank and describe the quality of several different data sets.

What I would like to do is come up with a "formula": given what the client wants to do with his/her data, we decide which metrics are most important and rate the data set using these metrics. This would give us a ranking of the overall "quality" of the data, which in turn could suggest where we should put most of our data cleaning effort, or whether the data set is suitable for data mining at all.

I have read Wand & Wang's "Anchoring Data Quality Dimensions in Ontological Foundations" (Communications of the ACM [1996], 39:11). They provide a nice framework for defining dimensions of data quality, but do not present any specific applications of their framework.

Does anyone have any suggestions for further reading? I am interested in specific applications of data quality metrics to real or simulated data sets, as well as in competing frameworks.

Thank you very much for your time. If the responses warrant, I will summarize them and post them to the list for everyone to see.

Rebecca Buchheit
rb6g@andrew.cmu.edu
Doctoral Candidate : Computer-Aided Engineering and Management
Civil Engineering : Carnegie Mellon University
http://www.contrib.andrew.cmu.edu/~rb6g