
Re: DM: discretization


From: Ronny Kohavi
Date: Mon, 18 Aug 1997 12:36:58 -0400 (EDT)

Bob> As decision trees are much easier to induce than generalized
Bob> classifiers, many people automatically (and blindly) discretize
Bob> their continuous variables prior to the induction process.

Bob> Does anyone know of general discussions of this discretizing or
Bob> quantizing process? How should variables that represent counts or
Bob> frequencies be treated? What about the situation where all but
Bob> one of the cases have the same value for a variable; should it
Bob> be treated as continuous?

There's an overview of discretization methods in

Dougherty, J., Kohavi, R. and Sahami, M., Supervised and unsupervised
discretization of continuous features. Machine Learning 1995.

and a second paper comparing the newer optimal error-minimizing
discretizer T2 with entropy-based discretization in

Kohavi, R., Sahami M., Error-Based and Entropy-Based Discretization of
Continuous Features. KDD-96.

Both are available at:
   http://robotics.stanford.edu/users/ronnyk/ronnyk-bib.html
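To give a concrete feel for the distinction those papers draw, here is a
rough sketch (mine, not code from either paper) contrasting two common
unsupervised discretizers, equal-width and equal-frequency binning, with a
single-cut supervised split chosen by class entropy. All function names and
the single-split simplification are my own illustration, not the T2 or MDL
algorithms themselves.

```python
import math

def equal_width_bins(values, k):
    """Unsupervised: split the value range into k equal-width intervals."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k or 1.0  # guard against a constant feature
    # Clamp so the maximum value falls into the last bin.
    return [min(int((v - lo) / width), k - 1) for v in values]

def equal_frequency_bins(values, k):
    """Unsupervised: cut points chosen so bins hold ~equal numbers of cases."""
    order = sorted(values)
    n = len(order)
    cuts = [order[(i * n) // k] for i in range(1, k)]
    return [sum(v >= c for c in cuts) for v in values]

def entropy(labels):
    """Class entropy in bits of a list of labels."""
    n = len(labels)
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def best_entropy_split(values, labels):
    """Supervised: the one cut point minimizing weighted class entropy."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best_cut, best_h = None, float("inf")
    for i in range(1, n):
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        h = (len(left) * entropy(left) + len(right) * entropy(right)) / n
        if h < best_h:
            best_cut = (pairs[i - 1][0] + pairs[i][0]) / 2
            best_h = h
    return best_cut
```

The unsupervised methods ignore the class labels entirely, which is exactly
why a supervised, entropy-based cut can do better when the class boundary
does not line up with the value quantiles.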

--

   Ronny Kohavi (ronnyk@sgi.com, http://robotics.stanford.edu/~ronnyk)




