
RE: DM: RE: Data Forms for Mining


From: T.S. Lim
Date: Tue, 30 May 2000 17:53:50 -0700
 >Date: Tue, 30 May 2000 10:57:07 -0700 (PDT)
 >From: Dan Steinberg <dstein@salford-systems.com>
 >To: "'datamine-l@nautilus-sys.com'" <datamine-l@nautilus-sys.com>
 >Subject: RE: DM: RE: Data Forms for Mining
 >Reply-To: datamine-l@nautilus-sys.com
 >
 >
 >On Fri, 26 May 2000, Collier, Ken wrote:
 >
 > > In your filtering suggestion are you saying that you generate multiple C5.0
 > > results from the entire database using different parameter settings, and
 > > then use the filtering node to isolate key features? Sounds like a variation
 > > on bundling. I'd like to know more. We are doing a lot with bundling,
 > > bagging, and boosting to improve our predictive accuracy.
 > >
 > > Ken Collier
 > > Senior Manager, Business Intelligence
 > > KPMG Consulting
 >
 >One method we have used for variable selection over the past 5 years is to
 >grow a large number of CART(R) trees using various settings on priors,
 >splitting rule, and test methods, and then provisionally eliminate variables
 >that have zero importance in all trees grown.  Zero importance means that
 >the variable did not appear as either a primary splitter or a surrogate
 >splitter at any node in any tree.  Bootstrap resampling (done automatically
 >under the bagging option) can generate quite a bit of variation in tree
 >structure; so can changes in priors and costs.  If a variable cannot play a
 >useful role in any tree under a broad range of tree growing strategies,
 >there is little risk in eliminating it.  This variable elimination method is
 >easily automated via scripts and is quite effective in radically reducing
 >the number of candidate predictors.
 >
 >
 >  *---------------------------+---------------------------------*
 >  | Dan Steinberg             | FAX (619) 543 8888              |
 >  | Salford Systems           | VOICE (619) 543-8880            |
 >  | 8880 Rio San Diego Dr     |                                 |
 >  | San Diego, CA 92108       | http://www.salford-systems.com  |
 >  *-------------------------------------------------------------*
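
A minimal sketch of the screening idea quoted above, for readers who want to experiment with it. It is not CART(R) and not Salford's script: the function names (`used_features`, `screen_variables`) and all parameters are hypothetical, the trees are small unpruned Gini/entropy trees written in plain Python, and "importance" here counts primary splitters only (no surrogates, no priors or costs). The settings varied are the bootstrap sample, the splitting rule, and the maximum depth.

```python
import math
import random
from collections import Counter

def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def best_split(rows, labels, features, impurity):
    """Return (feature, threshold, gain) for the best split, or None."""
    parent, n, best = impurity(labels), len(rows), None
    for f in features:
        values = sorted({r[f] for r in rows})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # midpoint between adjacent distinct values
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            child = (len(left) * impurity(left) + len(right) * impurity(right)) / n
            gain = parent - child
            if gain > 1e-12 and (best is None or gain > best[2]):
                best = (f, t, gain)
    return best

def used_features(rows, labels, features, max_depth, impurity):
    """Grow one tree recursively; return the features used as primary splitters."""
    if max_depth == 0 or len(set(labels)) <= 1:
        return set()
    split = best_split(rows, labels, features, impurity)
    if split is None:
        return set()
    f, t, _ = split
    used = {f}
    for go_left in (True, False):
        idx = [i for i, r in enumerate(rows) if (r[f] <= t) == go_left]
        used |= used_features([rows[i] for i in idx], [labels[i] for i in idx],
                              features, max_depth - 1, impurity)
    return used

def screen_variables(rows, labels, n_trees=30, depths=(2, 3, 4), seed=0):
    """Return feature indices that never split any node in any tree grown."""
    rng = random.Random(seed)
    features = list(range(len(rows[0])))
    rules = (gini, entropy)  # assumption: two splitting rules stand in for "various settings"
    ever_used = set()
    for k in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # bootstrap resample
        ever_used |= used_features([rows[i] for i in idx], [labels[i] for i in idx],
                                   features, depths[k % len(depths)], rules[k % len(rules)])
    return [f for f in features if f not in ever_used]
```

Calling `screen_variables(rows, labels)` on a list of numeric rows and class labels returns the column indices that are candidates for provisional elimination. A closer imitation of the quoted method would also count surrogate splitters and vary priors and costs, which this sketch does not attempt.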


Do you grow each large tree and then prune it? Or do you just grow the
large tree without performing tree selection by cross-validation or a
pruning sample? Thanks.



--
T.S. Lim
tslim@recursive-partitioning.com
www.Recursive-Partitioning.com



Copyright © 1999 Nautilus Systems, Inc. All Rights Reserved.