
DM: Re: information on diagnosis


From: Abraham Meidan
Date: Thu, 15 Apr 1999 03:52:23 -0400 (EDT)
Shlomo,

Many WizWhy users use it for similar projects.

WizWhy analyses the data by means of an association rules algorithm.
(This algorithm reveals all the if-then rules that relate the
dependent variable to the other fields in the data.) The rules are
then applied to issue predictions for new cases.
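
To make the idea of if-then rules concrete, here is a minimal sketch in
Python. It is not WizWhy's actual algorithm, which is not described in
this message. It simply enumerates every rule with one or two
conditions over categorical fields and keeps those with enough
supporting records and a high enough conditional probability. The
function name, the thresholds, and the sample records are all
hypothetical.

```python
from collections import namedtuple
from itertools import combinations

Rule = namedtuple("Rule", "conditions outcome support probability")

def mine_rules(records, target, max_conditions=2,
               min_support=2, min_probability=0.8):
    """Enumerate rules of the form 'if conditions then target == outcome'."""
    fields = sorted(k for k in records[0] if k != target)
    rules = []
    for size in range(1, max_conditions + 1):
        for combo in combinations(fields, size):
            # Group the target values by the values taken on the chosen fields.
            groups = {}
            for rec in records:
                key = tuple(rec[f] for f in combo)
                groups.setdefault(key, []).append(rec[target])
            for key, outcomes in groups.items():
                if len(outcomes) < min_support:
                    continue  # too few records to trust the rule
                for outcome in sorted(set(outcomes)):
                    prob = outcomes.count(outcome) / len(outcomes)
                    if prob >= min_probability:
                        rules.append(Rule(tuple(zip(combo, key)), outcome,
                                          len(outcomes), prob))
    return rules

# Made-up sample data: "status" is the dependent variable.
records = [
    {"load": "high", "cache": "off", "status": "poor"},
    {"load": "high", "cache": "off", "status": "poor"},
    {"load": "high", "cache": "on",  "status": "good"},
    {"load": "low",  "cache": "off", "status": "good"},
    {"load": "low",  "cache": "on",  "status": "good"},
    {"load": "low",  "cache": "on",  "status": "good"},
]

rules = mine_rules(records, "status")
for rule in rules:
    print(rule)
```

A brute-force enumeration like this grows quickly with the number of
fields and conditions, so it is only meant to show what a rule and its
probability look like, not how to analyse 100 fields efficiently.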

There is no limit as to the number of records or fields. The analysis
of 100 fields and 100,000 records will take a few hours.

> 1. Which algorithms used to detect the "interesting" (cause) X's ?

WizWhy contains a special algorithm for revealing interesting
(=unexpected) rules. These are rules having more than one condition,
that are unlikely given the one-condition rules. The calculation is
based on conditional probability.
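
As a rough illustration of what "unlikely given the one-condition
rules" can mean, the sketch below compares the conditional probability
of a two-condition rule with the probabilities of its one-condition
parts. The scoring function (the smallest gap between the joint rule
and any single-condition rule) is an assumption made for this example,
not WizWhy's actual calculation, and the data is made up.

```python
def conditional_probability(records, conditions, target, outcome):
    """P(target == outcome | all conditions hold), or None if nothing matches."""
    matching = [r for r in records
                if all(r[field] == value for field, value in conditions)]
    if not matching:
        return None
    return sum(r[target] == outcome for r in matching) / len(matching)

def unexpectedness(records, conditions, target, outcome):
    """Smallest gap between the joint rule and each one-condition rule."""
    joint = conditional_probability(records, conditions, target, outcome)
    if joint is None:
        return None
    singles = [conditional_probability(records, [c], target, outcome)
               for c in conditions]
    return min(abs(joint - single) for single in singles)

# Made-up data: the outcome is "poor" only when the pump is on AND the
# valve is closed, so neither condition alone predicts it well.
records = (
    [{"pump": "on",  "valve": "open",   "status": "good"}] * 2 +
    [{"pump": "on",  "valve": "closed", "status": "poor"}] * 2 +
    [{"pump": "off", "valve": "open",   "status": "good"}] * 2 +
    [{"pump": "off", "valve": "closed", "status": "good"}] * 2
)

rule = [("pump", "on"), ("valve", "closed")]
joint = conditional_probability(records, rule, "status", "poor")
gap = unexpectedness(records, rule, "status", "poor")
print(joint, gap)
```

Here each one-condition rule gives a probability of 0.5 for "poor",
while the two-condition rule gives 1.0, so the combined rule carries
information that the single conditions do not.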


> 2. How domain knowledge is integrated in the algorithms?

Since the domain knowledge is a set of if-then rules, and the data
mining algorithm reveals all the if-then rules, the domain knowledge
is redundant.


> 3. How are results (which are always somewhat uncertain) shown to the
>    novice user, who only wants to know the cause, and not any DM
>    mumbo-jumbo ?

The prediction results include the predicted classification and its
probability.
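
A minimal sketch of how rules might be applied to a new case so that
the user sees only a classification and its probability. How WizWhy
combines several matching rules is not stated here; picking the
matching rule with the highest probability is an assumption made for
this example, and the rules and their probabilities are invented.

```python
def predict(rules, case, default=(None, 0.0)):
    """Return (predicted class, probability) from the best matching rule."""
    matching = [rule for rule in rules
                if all(case.get(field) == value
                       for field, value in rule["if"].items())]
    if not matching:
        return default  # no rule covers this case
    best = max(matching, key=lambda rule: rule["probability"])
    return best["then"], best["probability"]

# Hypothetical rules, as if produced by an earlier analysis step.
rules = [
    {"if": {"load": "high"},                 "then": "poor", "probability": 0.67},
    {"if": {"load": "high", "cache": "off"}, "then": "poor", "probability": 0.95},
    {"if": {"load": "low"},                  "then": "good", "probability": 0.90},
]

label, probability = predict(rules, {"load": "high", "cache": "off"})
print(f"Predicted: {label} (probability {probability:.0%})")
```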


You can read about WizWhy and download a working demo from
www.wizsoft.com. The demo is limited to 1,000 records. There is no
limitation as to the number of fields. You can therefore run the demo
on your data and check the results.

Regards,

Abraham.




>Hello,
>
>Is anyone out there using DM (or Multivariate Regression or whatever)
>in combination with some domain knowledge for diagnostic purposes ?
>
>I'm working on extracting "probable cause" from large (100 - 100000
>records) datasets.
>The target is to understand which of the many (~100) variables (X's)
>affect the outcome of a (single known) target variable (Y).
>The algorithm is used to diagnose behavior of a large system, and
>tell the user what causes poor behavior.
>Some limited domain knowledge is available. It may be formulated as
>hierarchies and loose cause/effect relationships between the various
>X's.
>NOTE: Y and most X's are numeric. Some X's are categorical.
>
>I'm interested in any papers, books, algorithms, etc. about similar
>experiences.
>Especially:
> 1. Which algorithms used to detect the "interesting" (cause) X's ?
> 2. How domain knowledge is integrated in the algorithms?
> 3. How are results (which are always somewhat uncertain) shown to the
>    novice user, who only wants to know the cause, and not any DM
>    mumbo-jumbo ?
>
>Thanks,
>
>Shlomo Urbach
>Neptune Software
>shlomo@neptune.co.il





Copyright © 1999 Nautilus Systems, Inc. All Rights Reserved.