DM: missing attribute values in classification trees

From: Tjen-Sien Lim
Date: Thu, 29 Oct 1998 13:56:30 -0500 (EST)

Hi,

I'd like to get some advice from those of you who have analyzed datasets with missing attribute values in supervised learning. What's the "typical" proportion of cases with missing values? How big does the proportion have to be before it presents problems?

We're conducting a project comparing classification tree classifiers on datasets with missing values. To increase the number of datasets, we'd like to simulate missing-at-random values on datasets that contain no missing values. Should we simulate 5%, 10%, 20%, or 30% missing at random? Is there a preferred way to induce those missing values?

Thanks in advance for any advice/pointers/suggestions.

--
Tjen-Sien Lim
Ph.D. candidate
Dept. of Statistics
Univ. of Wisconsin-Madison
1210 West Dayton Street
Madison, WI 53706
(608) 262-8181
limt@stat.wisc.edu
http://www.stat.wisc.edu/~limt
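
For concreteness, here is a minimal sketch of one possible way to induce missing values at the rates mentioned above. It is only an illustration, not a recommended or established procedure. It assumes the simplest mechanism, where each attribute value is deleted independently with the same probability (missing completely at random), and it assumes the class labels are stored separately so that only attribute values are affected. The names `induce_missing` and `fraction_incomplete` are hypothetical helpers made up for this sketch.

```python
import random


def induce_missing(rows, rate, seed=0, missing=None):
    """Return a copy of `rows` in which each attribute value is replaced
    by `missing` independently with probability `rate`.

    Assumption: `rows` holds attribute values only (no class labels).
    """
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be between 0 and 1")
    rng = random.Random(seed)  # fixed seed so a run can be repeated
    return [[missing if rng.random() < rate else v for v in row]
            for row in rows]


def fraction_incomplete(rows, missing=None):
    """Proportion of cases (rows) with at least one missing value."""
    return sum(any(v is missing for v in row) for row in rows) / len(rows)


if __name__ == "__main__":
    # Toy complete dataset: 1000 cases, 5 attributes each (made-up values).
    complete = [[float(i + j) for j in range(5)] for i in range(1000)]
    for rate in (0.05, 0.10, 0.20, 0.30):
        holed = induce_missing(complete, rate, seed=1)
        print(rate, fraction_incomplete(holed))
```

Note that under this scheme the rate applies to individual values, so the proportion of cases with at least one missing value grows with the number of attributes; the second helper reports that proportion.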