Re: DM: Missing items in clustering

From: David L Dowe
Date: Wed, 8 Oct 1997 22:54:48 -0400 (EDT)

Dear DM people, particularly those interested in mixture modelling and its
synonyms (clustering, numerical taxonomy, intrinsic classification, etc.),

There are a couple of recurring themes in this mailing list:

One theme is that people mail in from time to time asking whether anyone
knows of a good program for mixture modelling (or its synonyms). Various
people, usually including me, respond.

A related sub-theme is that people mail in from time to time wishing to do
mixture modelling with multinomial (multi-category) variables and/or with
missing data.

If these issues do not interest you, please read no further.

If these issues do interest you, possibly (e.g.) administration could store
away the above two items as FAQs, as they certainly are asked frequently.

Re mixture modelling programs, see (e.g.)
- Murray Jorgensen's stuff (as in his e-mail below) and/or
- my Snob program (dating back to 1968) with Chris Wallace
  http://www.cs.monash.edu.au/~dld/Snob.html and/or
- my mixture modelling page, which is chock-a-block full of refs and links
  http://www.cs.monash.edu.au/~dld/mixture.modelling.page.html

Re mixture modelling programs which deal (as in the request below from
Raj Kumaralingam) with missing data or with multinomial data, two of the
few programs for doing this listed in my mixtures page
http://www.cs.monash.edu.au/~dld/mixture.modelling.page.html
are indeed Snob (Chris Wallace and David Dowe) and Lyn Hunt and
Murray Jorgensen's MULTIMIX. A third program for dealing with discrete data
(but perhaps not for missing data) is Marty Puterman's, given at
http://markov.commerce.ubc.ca/marty/ .

At this point, usually after Murray and I reply to the DM list, this topic
then goes quiet (till it is next raised).

Is anyone else out there aware of other mixture modelling programs for
multinomial data (other than Marty Puterman's) or missing data?
Also, re Snob and MULTIMIX (and Marty Puterman's work), does anyone out there
want to do an empirical study and publish (and report) it?

Regards, and earlier e-mail is appended below. - David.

Dr. David Dowe, Dept of Computer Science, Monash University,
Clayton, Victoria 3168, Australia
dld@cs.monash.edu.au   Fax: +61 3 9905-5146
http://www.cs.monash.edu.au/~dld/
http://www.cs.monash.edu.au/~dld/mixture.modelling.page.html
http://www.cs.monash.edu.au/~dld/Snob.html

> From owner-datamine-l@nessie.crosslink.net Wed Oct 8 10:08:54 1997
> Date: Wed, 08 Oct 1997 12:14:25 +1300
> To: "'datamine-l@nautilus-sys.com'" <datamine-l@nautilus-sys.com>
> From: Murray Jorgensen <maj@waikato.ac.nz>
> Subject: Re: DM: Missing items in clustering!
>
> My colleague Lyn Hunt has written a clustering program called MULTIMIX based
> on ideas related to what is often called "Naive Bayes" using the EM
> algorithm for maximum likelihood estimation with missing information. The
> information that is always missing is the association of objects to
> clusters, but in addition values of the measured variables may be missing.
>
> Distance matrix based clustering has an inherent problem with missing
> observations because there is no underlying statistical model. One ad hoc
> approach might be to regress individual variables against others and use
> fitted values to impute the values of missing data.
>
> At 14:32 7/10/97 -0500, you wrote:
> >Hi all,
> >Does anyone have any pointers to how to handle missing
> >items in a clustering context (I'm currently using Euclidean metric
> >based clustering)?
> >
> >Thanks in advance
> >Raj
>
> Dr Murray Jorgensen       maj@waikato.ac.nz    Phone +64-7 838 4773
> Department of Statistics  home phone 856 6705; Fax 838 4666
> University of Waikato     http://www.cs.waikato.ac.nz/stats/Staff/maj.html
> Hamilton, New Zealand     **** Editor: New Zealand Statistician ****
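
The ad hoc approach mentioned in the quoted message above (regress individual variables against the others and impute fitted values) might be sketched roughly as follows. This is only an illustration under stated assumptions, not code from Snob or MULTIMIX: the function name `regression_impute` is hypothetical, NumPy and ordinary linear least squares are assumed, and missing predictor values are temporarily filled with column means so that every regression has a complete design matrix.

```python
import numpy as np

def regression_impute(X):
    """Hypothetical helper: fill NaNs column by column with fitted values
    from a linear regression of that column on all the other columns."""
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    n, p = X.shape

    # Assumption: predictors that are themselves missing are replaced
    # by their column mean before fitting.
    col_means = np.nanmean(X, axis=0)
    filled = np.where(missing, col_means, X)
    result = filled.copy()

    for j in range(p):
        rows_missing = missing[:, j]
        if not rows_missing.any():
            continue  # nothing to impute in this column
        others = [k for k in range(p) if k != j]
        # Design matrix: intercept plus the (mean-filled) other variables.
        A = np.column_stack([np.ones(n), filled[:, others]])
        observed = ~rows_missing
        # Fit on rows where column j was actually observed.
        coef, *_ = np.linalg.lstsq(A[observed], X[observed, j], rcond=None)
        # Replace the missing entries with the fitted values.
        result[rows_missing, j] = A[rows_missing] @ coef
    return result

if __name__ == "__main__":
    # Toy data in which the second variable is exactly 2*x + 1.
    data = np.array([[1.0, 3.0],
                     [2.0, 5.0],
                     [3.0, 7.0],
                     [4.0, np.nan]])
    print(regression_impute(data))
```

On the toy data the missing entry is filled with a value close to 9, since the observed rows lie exactly on a line. The completed matrix could then be passed to a Euclidean-metric clustering routine, with the caveat from the message itself that this is an ad hoc fix with no underlying statistical model.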