Re: DM: Datamining Definition...and Machine Learning Definition.

From: Warren Sarle
Date: Thu, 23 Mar 2000 13:08:33 -0500 (EST)

> From: "Franklin Wayne Poley" <culturex@vcn.bc.ca>
> ...
> If we go with such a broad term then data mining/knowledge extraction
> becomes synonymous with machine learning does it not?

Unfortunately, data mining has become whatever the marketing people
trying to sell expensive software define it to be.

Machine learning is traditionally concerned with small, noise-free data
sets, primarily with categorical variables. In recent years the ML people
have shown more interest in noisy data and continuous variables, but they
still seem to view noisy data as an aberration.

Data mining is traditionally concerned with huge, noisy data sets with
all kinds of messy variables. Primarily, the purpose of data mining is
to create a predictive model for a specific target variable (such as
customer purchasing, credit card fraud, etc.) or to see if there are any
predictive relationships among a large number of variables (e.g.,
"associations and sequences", market basket analysis). Predictive models
for noisy data were called "statistical models" before some marketing
person came up with the term "data mining", which, by the way, used to
be a derogatory term in the statistical literature.

Secondarily, data mining is concerned with detecting outliers (anomalies,
novelties), which is another application of statistical models.

Ultimately, data mining is used to make decisions--usually business
decisions but perhaps medical decisions or various other kinds of
decisions. Making decisions based on noisy data is the province of
statistical decision theory, which is used in the SAS Enterprise Miner
product.

So I choose to define data mining as the application of statistical
decision theory to huge, messy data sets to maximize profits.

--
Warren S. Sarle       SAS Institute Inc.   The opinions expressed here
saswss@unx.sas.com    SAS Campus Drive     are mine and not necessarily
(919) 677-8000        Cary, NC 27513, USA  those of SAS Institute.