Re: DM: My introduction

From: Daniel X. Pape
Date: Tue, 29 Jul 1997 17:28:04 -0400 (EDT)

> > What I use for these categorizations are modified (and optimized!)
> > self-organizing maps (SOMs). I understand that many other people are
> > using SOMs for their datamining, so I would love to see a number of
> > discussions about people's experiences with them.
>
> Why do you use SOMs? Why not ordinary VQ? What properties of your
> textual and image collections do you expect the dimensions of the
> SOM grid to reveal?

Well, what I specifically do in my research group is create SOMs in order to automatically categorize a data collection to allow the user to _browse_ the collection. Once the SOM is created, I am using it to create 2D and 3D interfaces to allow the user to graphically browse the collections.

Another way I am using them is to automatically categorize a search result set for easy subsequent browsing - for example, if you do a search on AltaVista you might get 2000 results... a SOM could categorize the results so the user could easily pick the one or two hundred most relevant results.

From what I understand of vector quantization methods, there are two reasons why I don't use them:

One, for the most part, they are _supervised learning_ methods. Since I am trying to categorize things automatically, I have to rely on _unsupervised_ methods. Obviously the user is not going to want to sit there and worry about training a VQ during a search session.

Two, the VQ methods are meant for statistical classification or pattern recognition - not categorization. What I'm trying to do with the SOMs is to cluster and visualize the collections in a meaningful way. The VQ methods might eventually give more accurate classifications, but I am looking for fast (maybe rough) categorizations so the user can proceed with his task.
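A minimal sketch of the categorization idea above, in pure Python. This is a plain rectangular SOM, not the modified and optimized variant mentioned in the message, whose details are not given here. It is trained without labels on feature vectors (for instance, term counts per document), and each item is then grouped under its best-matching grid cell. The names train_som, best_matching_unit, and categorize, the grid size, and the linear decay schedules are all assumptions made for illustration.

```python
import math
import random


def best_matching_unit(grid, v):
    """Return (row, col) of the cell whose weights are closest to v."""
    best, best_d = (0, 0), float("inf")
    for r, row in enumerate(grid):
        for c, w in enumerate(row):
            d = sum((a - b) ** 2 for a, b in zip(v, w))
            if d < best_d:
                best, best_d = (r, c), d
    return best


def train_som(vectors, rows, cols, epochs=50, lr0=0.5, seed=0):
    """Train a small rows x cols SOM on unlabeled vectors (illustrative only)."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    # Random initial weights, one vector per grid cell.
    grid = [[[rng.random() for _ in range(dim)] for _ in range(cols)]
            for _ in range(rows)]
    sigma0 = max(rows, cols) / 2.0
    total = epochs * len(vectors)
    step = 0
    for _ in range(epochs):
        order = list(vectors)
        rng.shuffle(order)
        for v in order:
            frac = step / total
            lr = lr0 * (1.0 - frac)                  # assumed linear decay
            sigma = max(sigma0 * (1.0 - frac), 0.5)  # shrinking neighborhood
            br, bc = best_matching_unit(grid, v)
            for r in range(rows):
                for c in range(cols):
                    # Gaussian neighborhood around the winner on the grid.
                    d2 = (r - br) ** 2 + (c - bc) ** 2
                    h = math.exp(-d2 / (2.0 * sigma * sigma))
                    w = grid[r][c]
                    for i in range(dim):
                        w[i] += lr * h * (v[i] - w[i])
            step += 1
    return grid


def categorize(grid, labeled_vectors):
    """Group (label, vector) pairs by the grid cell each vector maps to."""
    cells = {}
    for label, v in labeled_vectors:
        cells.setdefault(best_matching_unit(grid, v), []).append(label)
    return cells
```

Calling categorize(train_som(vectors, 3, 3), labeled_vectors) returns a dictionary from grid cell to the labels that landed there. Because neighboring cells end up with similar weights, that dictionary can back a 2D browsing interface of the sort described above, with no labeled training data involved.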
> I have tried to use the WEBSOM application at
> http://websom.hut.fi/websom/comp.ai.neural-nets/html/root.html
> to search for articles in comp.ai.neural-nets, and I found it quite
> useless. Dejanews works far better.

The results you got were different because the two tools you used (WEBSOM and DejaNews) were designed for two different things. It depends on how you were searching.

If you were searching comp.ai.neural-nets for a specific term or author or article, then of course DejaNews will be better - DejaNews uses very powerful and very fast search methods - but at their heart, they are just simple string matching methods. If you were trying to find a specific term or author or article with WEBSOM, you will have a hard time finding it.

But if you want to BROWSE the collection of comp.ai.neural-nets to see what kind of articles are in it, you would have an easier time with the WEBSOM. The WEBSOM interface may be a bit confusing to the new user, so it might not be that easy - but you could never BROWSE the c.a.nn collection with DejaNews.

Dan

--
Daniel X. Pape
Digital Library Research Program
dpape@ncsa.uiuc.edu