RE: DM: RE: Data Forms for Mining

From: T.S. Lim
Date: Tue, 30 May 2000 17:53:50 -0700

>Date: Tue, 30 May 2000 10:57:07 -0700 (PDT)
>From: Dan Steinberg <dstein@salford-systems.com>
>To: "'datamine-l@nautilus-sys.com'" <datamine-l@nautilus-sys.com>
>Subject: RE: DM: RE: Data Forms for Mining
>Reply-To: datamine-l@nautilus-sys.com
>
>
>On Fri, 26 May 2000, Collier, Ken wrote:
>
>> In your filtering suggestion are you saying that you generate multiple C5.0
>> results from the entire database using different parameter settings, and
>> then use the filtering node to isolate key features? Sounds like a variation
>> on bundling. I'd like to know more. We are doing a lot with bundling,
>> bagging, and boosting to improve our predictive accuracy.
>>
>> Ken Collier
>> Senior Manager, Business Intelligence
>> KPMG Consulting
>
>One method we have used for variable selection over the past 5 years is to grow
>a large number of CART(R) trees using various settings on priors, splitting
>rule, and test methods, and then provisionally eliminating variables that have a
>zero importance in all trees grown. Zero importance means that the variable did
>not appear as either a primary splitter or a surrogate splitter at any node in
>any tree. Bootstrap resampling (done automatically under the bagging
>option) can generate quite a bit of variation in tree structure; so can changes
>in priors and costs. If a variable cannot play a useful role in any tree under
>a broad range of tree growing strategies there is little risk in eliminating
>it. This variable elimination method is easily automated via scripts and is
>quite effective in radically reducing the number of candidate predictors.
>
>*---------------------------+---------------------------------*
>| Dan Steinberg             | FAX (619) 543 8888              |
>| Salford Systems           | VOICE (619) 543-8880            |
>| 8880 Rio San Diego Dr     |                                 |
>| San Diego, CA 92108       | http://www.salford-systems.com  |
>*-------------------------------------------------------------*

Do you grow the large tree and then prune it? Or do you just grow the large
tree without performing tree selection by cross-validation or a pruning sample?

Thanks.

--
T.S. Lim
tslim@recursive-partitioning.com
www.Recursive-Partitioning.com
------------------------------------------------------------
Get paid to write review! http://recursive-partitioning.epinions.com
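
For readers who want to try the elimination idea from the quoted message, here
is a minimal sketch in plain Python. It is not CART(R) and not Salford's
script. Everything in it is an assumption made for illustration: the toy tree
grower, the function names (impurity, best_split, used_features, never_used),
the settings grid (splitting rule and minimum leaf size), and the demo data.
It tracks primary splitters only; surrogate splitters, priors, and costs are
not modeled, and the trees are depth-limited rather than pruned, so it does not
settle the pruning question asked above.

```python
import math
import random
from collections import Counter


def impurity(labels, criterion):
    """Gini or entropy impurity of a list of class labels."""
    n = len(labels)
    probs = [count / n for count in Counter(labels).values()]
    if criterion == "gini":
        return 1.0 - sum(p * p for p in probs)
    return -sum(p * math.log2(p) for p in probs)


def best_split(rows, labels, criterion, min_leaf):
    """Return (gain, feature, threshold) for the best split, or None."""
    parent = impurity(labels, criterion)
    best = None
    for j in range(len(rows[0])):
        values = sorted(set(r[j] for r in rows))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [y for r, y in zip(rows, labels) if r[j] <= t]
            right = [y for r, y in zip(rows, labels) if r[j] > t]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            child = (len(left) * impurity(left, criterion)
                     + len(right) * impurity(right, criterion)) / len(labels)
            gain = parent - child
            # Strict comparison: on ties the earlier feature is kept.
            if gain > 0 and (best is None or gain > best[0]):
                best = (gain, j, t)
    return best


def used_features(rows, labels, criterion, min_leaf, max_depth):
    """Grow one unpruned tree; return the features used as primary splitters."""
    if max_depth == 0 or len(set(labels)) < 2:
        return set()
    split = best_split(rows, labels, criterion, min_leaf)
    if split is None:
        return set()
    _, j, t = split
    used = {j}
    for go_left in (True, False):
        idx = [i for i, r in enumerate(rows) if (r[j] <= t) == go_left]
        used |= used_features([rows[i] for i in idx],
                              [labels[i] for i in idx],
                              criterion, min_leaf, max_depth - 1)
    return used


def never_used(rows, labels, n_boot=5, seed=0):
    """Indices of features that no tree split on, across all settings."""
    rng = random.Random(seed)
    n = len(rows)
    used = set()
    for criterion in ("gini", "entropy"):        # vary the splitting rule
        for min_leaf in (1, 5):                  # vary a growing setting
            for _ in range(n_boot):              # bootstrap resamples
                idx = [rng.randrange(n) for _ in range(n)]
                used |= used_features([rows[i] for i in idx],
                                      [labels[i] for i in idx],
                                      criterion, min_leaf, max_depth=4)
    return sorted(set(range(len(rows[0]))) - used)


# Hypothetical demo data: feature 0 determines the class, 1 and 2 are noise.
noise = random.Random(1)
rows = [[i / 60, noise.random(), noise.random()] for i in range(60)]
labels = [int(r[0] >= 0.5) for r in rows]
dropped = never_used(rows, labels)
print(dropped)
```

The returned indices are only candidates for provisional elimination, in the
sense of the quoted message. A real run would use a much wider range of
settings and many more trees before dropping anything.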