Re: DM: boltzmann machine
From: Ronny Kohavi
Date: Mon, 29 Sep 1997 21:11:56 -0400 (EDT)

Ted> I have two unsupervised learning questions for the group.
Ted>
Ted> 1) There are several bias variance decompositions of
Ted> classification error proposed for supervised classification
Ted> methods. Has anyone applied such a decomposition to the
Ted> classification errors of an unsupervised learner? I think the
Ted> ideas of bias and variance still apply in the unsupervised case
Ted> although variance may have a different meaning and I think we
Ted> have to measure these quantities differently. Any thoughts or
Ted> questions on this general area would be of great interest. I can
Ted> provide more details of what I'm thinking of doing to anyone who
Ted> might be interested.

Many decompositions of error or other measures are possible, and I'm
sure you can cook some up for several unsupervised learning methods.
In fact, some unsupervised criteria are nicely decomposable, for
example the inter- and intra-cluster distances from the centroids.

The nice thing about the bias and variance decomposition for
regression (and some of the proposed decompositions for
classification) is that the two terms have a rather natural
interpretation: being "biased" away from the "optimal" on average,
and "varying" around the average. There is the usual tradeoff between
the two: when you increase the representational power of your
hypothesis space, you can sometimes reduce the bias, but you're
likely to increase the variance. This is why many "surprising"
phenomena can be explained using the decomposition.

In unsupervised algorithms, there is no obvious measure of bias
because there is no pre-specified target function to learn (if you
don't know what the "right" answer is, how can you know that you're
biased away from it?). Clustering is an optimization problem once you
define the minimization criterion. Supervised learning is different:
sometimes it's better to distance yourself from the minimum error on
the training set to improve the generalization accuracy (e.g.,
decision tree pruning, weight sharing in neural nets, multiple
nearest neighbors).
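To make the centroid example concrete, here is a minimal numpy sketch
(the data and names are my own toy illustration, not from Ted's
setup): the total scatter of the points around the grand mean splits
exactly into an intra-cluster term (points around their centroids)
plus an inter-cluster term (centroids around the grand mean, weighted
by cluster size).

import numpy as np

def scatter_decomposition(X, labels):
    # Total scatter of all points around the grand mean...
    grand_mean = X.mean(axis=0)
    total = ((X - grand_mean) ** 2).sum()
    intra, inter = 0.0, 0.0
    for k in np.unique(labels):
        cluster = X[labels == k]
        centroid = cluster.mean(axis=0)
        # ...equals points-around-centroid scatter...
        intra += ((cluster - centroid) ** 2).sum()
        # ...plus size-weighted centroid-around-grand-mean scatter.
        inter += len(cluster) * ((centroid - grand_mean) ** 2).sum()
    return total, intra, inter

# Two toy Gaussian clusters; the identity holds for any labeling,
# as long as the centroids are the cluster means.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(5, 1, (50, 2))])
labels = np.array([0] * 50 + [1] * 50)
total, intra, inter = scatter_decomposition(X, labels)
assert np.isclose(total, intra + inter)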
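The regression decomposition itself can also be estimated
empirically. The sketch below is again a toy illustration of my own
(a sine target and polynomial learners): draw many training sets, fit
a learner to each, and at one test point measure the squared bias
(mean prediction vs. truth) and the variance (spread of the
predictions around their own mean). Raising the degree enlarges the
hypothesis space and typically trades bias for variance, as above.

import numpy as np

def f(x):
    # The "true" target function, known only in this toy setting.
    return np.sin(x)

def bias_variance_at(x0, degree, n_trials=200, n_train=30, noise=0.3):
    rng = np.random.default_rng(0)
    preds = []
    for _ in range(n_trials):
        # A fresh noisy training set from the same distribution.
        x = rng.uniform(-np.pi, np.pi, n_train)
        y = f(x) + rng.normal(0, noise, n_train)
        # The learner: least-squares polynomial of the given degree.
        coeffs = np.polyfit(x, y, degree)
        preds.append(np.polyval(coeffs, x0))
    preds = np.array(preds)
    bias_sq = (preds.mean() - f(x0)) ** 2
    variance = preds.var()
    return bias_sq, variance

for degree in (1, 3, 9):
    print(degree, bias_variance_at(1.0, degree))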
--
Ronny Kohavi (ronnyk@sgi.com, http://robotics.stanford.edu/~ronnyk)
Engineering Manager, Analytical Data Mining.
Silicon Graphics, Inc.