
DM: Re: problem of sample size


From: Mac Johnstone
Date: Mon, 28 Aug 2000 17:32:34 -0700

Dear Vinnie:
There was quite a bit of discussion on this forum in May concerning a
similar problem.  Here is a method I have used successfully, based on
misclassification costs (see the message from Earl S. Harris).  Take the ratio
of A to B.  In your case this is 9,300,000/200,000 = 46.5.  Round up to 47.
Now set the classification costs so that it costs 47 to classify a B as an A
and 1 to classify an A as a B.  Of course, it costs 0 to classify an A as an
A and a B as a B.
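
The cost scheme above can be sketched as a minimum-expected-cost decision
rule.  This is only an illustration under stated assumptions, not the exact
procedure of any particular tool: the function names are made up, the counts
(900 A's, 100 B's) are toy numbers, and p_b is assumed to be some model's
estimated probability that a record belongs to B.

```python
import math

def cost_matrix(n_a, n_b):
    """Build misclassification costs from the class counts.

    Keys are (true_class, predicted_class). Calling a B an A costs the
    A-to-B ratio rounded up; calling an A a B costs 1; correct calls cost 0.
    """
    ratio = math.ceil(n_a / n_b)
    return {
        ("A", "A"): 0, ("A", "B"): 1,
        ("B", "A"): ratio, ("B", "B"): 0,
    }

def expected_cost(costs, predicted, p_b):
    """Expected cost of predicting `predicted` when P(true class is B) = p_b."""
    p_a = 1.0 - p_b
    return p_a * costs[("A", predicted)] + p_b * costs[("B", predicted)]

def classify(costs, p_b):
    """Pick the prediction with the lower expected cost."""
    return min(("A", "B"), key=lambda pred: expected_cost(costs, pred, p_b))

# Toy example (hypothetical counts, not the figures from this thread).
costs = cost_matrix(n_a=900, n_b=100)   # ratio 9, so B-as-A costs 9
low = classify(costs, p_b=0.05)         # weak evidence for B
high = classify(costs, p_b=0.20)        # moderate evidence for B
```

With these costs a record is called B as soon as p_b exceeds 1/(1 + ratio),
rather than 0.5, which is how the rare class gets a fair hearing without
rebalancing the data.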

The method above works with a random sample from the population that
contains representative numbers of both A's and B's.  As for selecting a
sample size, I recommend the procedure in the book Data Preparation for Data
Mining by Dorian Pyle.
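
As a rough illustration of what "representative" means here, the sketch below
draws a simple random sample and compares the share of B's in the sample with
the share in the population.  It does not reproduce the sample-size procedure
from Pyle's book; the population, the sample size, and the function names are
all made-up assumptions.

```python
import random

def share_of_b(labels):
    """Fraction of records labelled 'B'."""
    return sum(1 for label in labels if label == "B") / len(labels)

def draw_sample(labels, sample_size, seed=0):
    """Simple random sample without replacement (seeded for repeatability)."""
    rng = random.Random(seed)
    return rng.sample(labels, sample_size)

# Toy population (hypothetical): 9,800 A's and 200 B's, i.e. 2% B.
population = ["A"] * 9800 + ["B"] * 200
sample = draw_sample(population, sample_size=2000)

pop_share = share_of_b(population)
sample_share = share_of_b(sample)
print(f"B share in population: {pop_share:.3f}, in sample: {sample_share:.3f}")
```

If the two shares are close, the costs derived from the population ratio can
reasonably be applied to models built on the sample.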

Regards,
Mac
----- Original Message -----
From: vinnie <ejan@otech.co.kr>
To: DM MailingList <datamine-l@nautilus-sys.com>
Sent: Friday, August 25, 2000 7:39 PM
Subject: DM: problem of sample size


 > Although this is a fairly traditional question, I would like to know how
 > you deal with this kind of problem.
 >
 > The population is about 9,500,000 records, split into two groups, A and B.
 > Unfortunately, A has 9,300,000 records and B has only 200,000.
 >
 > B is certainly large enough to sample or analyze, but we have to balance
 > the sizes of the two groups. What is an appropriate sample size for each
 > group, and what kind of sampling methods could be applied?
 >
 > This problem is similar to the case of 1 bad guy and 99 good guys among
 > 100 guys.





Copyright © 1998-2000 Nautilus Systems, Inc. All Rights Reserved.