
RE: AW: DM: RE: Data Forms for Mining (Limit on variables)

From: osborn
Date: Thu, 25 May 2000 12:26:08 +1000

 > I am new to this. What is "VC-dimension"?

Vapnik-Chervonenkis dimension. E.g., see "The Nature of Statistical
Learning Theory" or "Statistical Learning Theory", both by
V. N. Vapnik. Fairly heavy reading...

" The VC dimension of a set of indicator functions Q(z,a), a in L,
is equal to the largest number h of vectors z1..zh that can be
separated into two different classes in all the 2^h possible ways
using this set of functions. " The notion here is being able to
"shatter" vectors into two sets in all possible ways. Further
theorems bound the associated risk of misclassification
(or relate model complexity to the _minimum_ data required
to build a model, given a particular VC-dimension and a given
probability of misclassification).
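
To make "shattering" concrete, here is a small brute-force sketch in Python. It is only an illustration, not anything from Vapnik's books: it assumes a toy family of indicator functions, the closed intervals on the real line (Q(z,a) = 1 if lo <= z <= hi, else 0), and the function names are made up for this example. For that family any two distinct points can be shattered but no three can, so the VC dimension is 2.

```python
from itertools import combinations

def interval_labelings(points):
    """All 0/1 labelings of `points` produced by some indicator 1[lo <= x <= hi].

    Assumption: it is enough to try endpoints taken from the points
    themselves, plus the empty interval, because shrinking an interval
    to the outermost points it contains does not change the labeling.
    """
    cuts = sorted(set(points))
    labelings = {tuple(0 for _ in points)}  # empty interval labels everything 0
    for i in range(len(cuts)):
        for j in range(i, len(cuts)):
            lo, hi = cuts[i], cuts[j]
            labelings.add(tuple(1 if lo <= x <= hi else 0 for x in points))
    return labelings

def is_shattered(points):
    """True if the interval family realizes all 2^h labelings of the h points."""
    return len(interval_labelings(points)) == 2 ** len(points)

def largest_shattered_subset_size(candidates, max_h):
    """Largest h <= max_h such that some h-subset of `candidates` is shattered.

    This only searches the given candidate pool, so in general it is a
    lower bound on the VC dimension of the family.
    """
    best = 0
    for h in range(1, max_h + 1):
        if any(is_shattered(subset) for subset in combinations(candidates, h)):
            best = h
    return best

if __name__ == "__main__":
    print(is_shattered((1.0, 2.0)))        # two distinct points
    print(is_shattered((1.0, 2.0, 3.0)))   # labeling (1, 0, 1) is impossible
    print(largest_shattered_subset_size([0, 1, 2, 3, 4], 4))
```

The three-point case fails because no single interval can contain the outer two points while excluding the middle one. Richer function families shatter more points, which is the sense in which VC dimension measures model complexity.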

In a practical situation there are issues of excluding classes of
models (weaker approximation), and being able to discriminate
between different classifications on the input data set (weaker

As others pointed out, in practical situations there are usually
a few (50?) variables which can do most of the work of building
a model that a client will find useful. The ART is identifying the
50. [And it's not always the same 50 for the whole of the input

If you know something about the model (hints), the situation
changes. One thing you can "know" is that the functions
(of many input variables) should be fairly simple...


Dr Tom Osborn
Director of Modelling
Decision Support Consultants
Level 7, 1 York Street
phone:	+61 2 9252 0600
fax:	+61 2 9251 9894
