
Re: DM: RE: Genetic Algorithms


From: Min Pei
Date: Thu, 14 Aug 1997 12:49:26 -0400 (EDT)
----- Forwarded message from Min Pei -----

From: pei@egr.msu.edu (Min Pei)
Message-Id: <199708141629.MAA04260@speed>
Subject: Re: DM: RE: Genetic Algorithms
In-Reply-To: <25169.9708121712@fat-controller.cs.bham.ac.uk> from 
"A.N.Pryke@cs.bham.ac.uk" at "Aug 12, 97 06:12:48 pm"
To: A.N.Pryke@cs.bham.ac.uk
Date: Thu, 14 Aug 1997 12:29:56 -0400 (EDT)
Cc: pei@egr.msu.edu (Min Pei)

  In recent years I have been doing research on data mining using
  genetic algorithms. If you are interested in this subject, you can
  visit my web site to pick up and refer to my recent papers.
    The URL is:
     "http://www.egr.msu.edu/~pei/paper/research_index.html"
   Here are the papers:

 GApaper95-01: M. Pei, Y. Ding, W.F. Punch and E.D. Goodman,
 "Classification and Feature Extraction of High-Dimensionality Binary
  Patterns using a GA to Evolve Rules",
  Technical Report Jan. 95. 

 GApaper95-02: M. Pei, E.D. Goodman, W.F. Punch and Y. Ding,
 "Genetic Algorithms For Classification and Feature Extraction",
 1995 Annual Meeting, Classification Society of North America, June 1995.

 GApaper96-01: M. Pei, E.D. Goodman, W.F. Punch,
 "Pattern Discovery From Data Using Genetic Algorithms",
 First Pacific-Asia Conference on Knowledge Discovery & Data Mining,
 Feb. 1997.

 GApaper97-01: M. Pei, E.D. Goodman, W.F. Punch,
 "Feature Extraction Using Genetic Algorithms", 
  Technical Report June 97.

 Data mining using genetic algorithms has already been put into practice.
 I learned from the book "Data Mining" that the data mining company
 Syllogic successfully completed a project on pilot planning for KLM
 airline, along with other projects. Thinking Machines uses a GA
 called StarGene to optimize its neural network. You can find other
 such stories on the Web.
  I believe there is great potential for using GAs in data mining.
 Of course, GAs have their own advantages and disadvantages, so
 choosing the right tool for your problem is important.
 ---Pei---

  
****************************************************************************
  Min Pei
  Visiting Professor and Researcher
  Case Center for Computer-Aided Engineering and Manufacturing
  Michigan State University
  Professor
  Beijing Union University

  Snail mail:
   Case Center for Computer-Aided Engineering and Manufacturing
   2325 Engineering Bldg.
   Michigan State University
   East Lansing, MI 48824      
    U.S.A.               
  Home address:
   1953 N. Harrison Ave. East Lansing MI 48823 U.S.A.

   Phone: (517)353-4973(O), (517)333-2666(H)
     FAX: (517)432-0704
  E-mail addr: pei@egr.msu.edu
  http://www.egr.msu.edu/~pei
  
****************************************************************************


> 
> Sarab <ss.anand@ulst.ac.uk> wrote:
> 
> > A fairly comprehensive GA related web site is:
> > http://www.shef.ac.uk/~gaipp/galinks.html However, these are not
> > necessarily Data Mining. The only people I am aware of that are
> > working in Data Mining using GAs are Prof. Vic Rayward-Smith of
> > Univ. of East Anglia and Quadstone Ltd. in Edinburgh.
> 
> 
> My work involves a flexible search engine for data mining, which can
> operate as a symbolic GA.
> 
> The system discovers classification, association and cluster
> rules. Unfortunately, I don't have anything online on the GA side at
> the moment. Some  information on the visualisation side is
> at: http://www.cs.bham.ac.uk/~anp/haiku - this is a bit out of date,
> but the pictures are still pretty!
> 
> 
> Other systems of relevance are (with apologies to non-latex
> speakers):
> 
> GABIL \cite{dejong:learning-concept:91} learns classification rules
>  from examples with symbolic attributes.
> 
> HDBPCS (High Dimensionality Binary Pattern Classification System)
> \cite{pei.ea:classification-feature:95}- a system for
>  discovering classification rules and subsequent feature extraction
>  from binary data.
> 
> Beagle uses a genetic algorithm on symbolic rules to generate
> classification rules \cite{forsyth:inductive-learning:89} of the
> form:
> IF $((5*pressure) > temperature)$ THEN item is in class $C$ .
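
A rule of the form quoted above can be treated as a boolean predicate and
scored against labelled records, which is the kind of fitness a GA needs in
order to rank candidate rules. The following is a minimal sketch under
stated assumptions: the records, the field names and the accuracy-based
fitness are all made up for illustration and are not taken from Beagle.

```python
# Hypothetical sketch: scoring one symbolic rule of the quoted form.
# The records and the accuracy-based fitness are illustrative assumptions,
# not Beagle's actual data or scoring method.

def rule(record):
    """IF ((5 * pressure) > temperature) THEN item is in class C."""
    return (5 * record["pressure"]) > record["temperature"]

def fitness(predicate, records):
    """Fraction of records where the rule's verdict matches the label."""
    hits = sum(1 for r in records if predicate(r) == r["in_class_c"])
    return hits / len(records)

# Made-up toy records, for illustration only.
records = [
    {"pressure": 10, "temperature": 30, "in_class_c": True},
    {"pressure": 4, "temperature": 30, "in_class_c": False},
    {"pressure": 8, "temperature": 20, "in_class_c": True},
    {"pressure": 2, "temperature": 5, "in_class_c": False},
]

score = fitness(rule, records)
print(score)
```

A GA would evaluate many such predicates per generation and keep the ones
with the highest score.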
> 
> COGIN (COverage-based Genetic INduction)
> \cite{greene.ea:cogin-symbolic:92} is a GA-based system for the
> induction of classification rules.
> 
> SIA (Supervised Induction Algorithm) \cite{venturini:sia-supervised:93}
> learns conjunctive classification rules from pre-classified examples.
> SIA is similar to the AQ algorithm
> \cite{michalski.ea:multi-purpose-incremental:86} in that it generates
> new rules using uncovered examples as a seed.
> 
> SIA01 \cite{augier.ea:learning-first:95} learns First Order Logic
> (FOL) rules for binary classification.
> 
> In a paper entitled ``Co-operation through Hierarchical Competition in
> Genetic Data Mining''\cite{radcliffe.ea:co-operation-throught:94},
> Radcliffe and Surry discuss a two-level hierarchical approach which
> finds rule sets with good coverage of the data. The low-level GA is
> used to discover individual rules. The high-level GA is then applied
> to create rulesets from these.
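
The high-level stage of such a two-level scheme can be sketched as a GA over
bit masks that select which rules go into a rule set. This is only a sketch
under stated assumptions, not Radcliffe and Surry's method: the rules are
assumed to have been found already and are represented only by the example
indices they cover, and the fitness (coverage minus a small per-rule
penalty), the operators and all names are hypothetical.

```python
import random

# Hypothetical input: the set of example indices each rule covers, as if a
# low-level GA had already discovered the individual rules.
RULE_COVERAGE = [{0, 1, 2}, {2, 3}, {4, 5}, {0, 5}, {6}]
NUM_EXAMPLES = 7

def ruleset_fitness(mask):
    """Coverage of the selected rules, minus a small penalty per rule used."""
    covered = set()
    for use, cov in zip(mask, RULE_COVERAGE):
        if use:
            covered |= cov
    return len(covered) / NUM_EXAMPLES - 0.01 * sum(mask)

def evolve_ruleset(generations=30, pop_size=12, seed=0):
    """High-level GA: evolve bit masks that pick rules for a rule set."""
    rng = random.Random(seed)
    n = len(RULE_COVERAGE)
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=ruleset_fitness, reverse=True)
        parents = pop[: pop_size // 2]      # truncation selection, best kept
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randint(1, n - 1)     # one-point crossover
            child = a[:cut] + b[cut:]
            i = rng.randrange(n)            # mutation: flip one random bit
            child[i] = 1 - child[i]
            children.append(child)
        pop = parents + children
    return max(pop, key=ruleset_fitness)

best = evolve_ruleset()
print(best, ruleset_fitness(best))
```

The per-rule penalty is one simple way to prefer small rule sets among those
with equal coverage; other trade-offs are possible.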
> 
> GA-Miner \cite{flockhart.ea:genetic-algorithm-based:96} uses a genetic
> algorithm to discover three types of pattern: predictive rules with
> expressions on both LHS and RHS; ``distribution shift patterns'' which
> indicate that a particular attribute has a different distribution in a
> subset of the data; and ``correlation patterns'' which assert that two
> attributes are correlated in a particular subset.
> 
> I believe Ultragem also have a GA based data mining system.
> 
> 
> If anyone else is working in this field and knows of other relevant
> systems, please email me (or the group) and tell me about them. 
> 
> Thanks, 
> 
>   Andy
> 
> 
> References
> ----------
> 
> 
> 
> 
> @InProceedings{augier.ea:learning-first:95,
>   author =       "S. Augier and G. Venturini and Y. Kodratoff",
>   title =        "Learning First Order Logic Rules with a Genetic
>                  Algorithm",
>   booktitle =    "Proceedings of the First International Conference on
>                  Knowledge Discovery and Data Mining (KDD'95)",
>   year =         "1995",
>   pages =        "21--26",
> 
> }
> 
> @InProceedings{bala.ea:using-genetic:91,
>   author =       "J. Bala and K. DeJong and P. Pachowicz",
>   title =        "Using Genetic Algorithms to improve the performance of
>                  classification rules produced by symbolic inductive
>                  methods",
>   editor =       "Z. W. Ras and M. Zemankova",
>   pages =        "286--295",
>   booktitle =    "Proceedings of 6th International Symposium
>                  Methodologies for Intelligent Systems ISMIS'91",
>   year =         "1991",
>   publisher =    "Springer-Verlag, Berlin, Germany",
>   address =      "Charlotte, NC",
>   month =        "16-19 " # oct,
> }
> 
> @Article{bala.ea:using-genetic:91a,
>   key_modifier = "a",
>   author =       "J. Bala and K. DeJong and P. Pachowicz",
>   title =        "Using genetic algorithms to improve the performance of
>                  classification rules produced by symbolic inductive
>                  method",
>   journal =      "Lecture Notes in Computer Science",
>   volume =       "542",
>   pages =        "286--295",
>   year =         "1991",
>   ISSN =         "0302-9743",
> }
> 
> @InProceedings{dejong:learning-concept:91,
>   author =       "K. A. DeJong and W. M. Spears",
>   title =        "Learning Concept Classification Rules Using Genetic
>                  Algorithms",
>   year =         "1991",
>   booktitle =    "Proceedings of the International Joint Conference on
>                  Artificial Intelligence",
>   address =      "Sydney, Australia",
>   pages =        "651--656",
>   keywords =     "GABIL, pittsburgh approach, binary representation",
> }
> 
> @InProceedings{flockhart.ea:genetic-algorithm-based:96,
>   author =       "I. W. Flockhart and N. J. Radcliffe",
>   title =        "A Genetic Algorithm-Based Approach to Data Mining",
>   booktitle =    "The Second International Conference on Knowledge
>                  Discovery and Data Mining (KDD-96)",
>   editor =       "Evangelos Simoudis and Jia Wei Han and Usama Fayyad",
>   year =         "1996",
>   month =        aug # " 2-4",
>   keywords =     "GA-Miner, Genetic Algorithms, Quadstone",
>   address =      "Portland, Oregon, USA",
>   publisher =    "AAAI",
>   annote =       "KDD-96
>                  http://www.aaai.org:80/Press/Proceedings/KDD/1996/kdd-96.html",
> }
> 
> @InProceedings{greene.ea:cogin-symbolic:92,
>   author =       "D. P. Greene and S. F. Smith",
>   title =        "{COGIN}: Symbolic Induction with Genetic Algorithms",
>   year =         "1992",
>   booktitle =    "Proc.\ of AAAI-92",
>   pages =        "111--116",
>   keywords =     "GA",
> }
> 
> @Article{greene.ea:competition-based-induction:93,
>   author =       "D. P. Greene and S. F. Smith",
>   address =      "Carnegie Mellon Univ, Sch Comp Sci, Inst Robot,
>                  Pittsburgh, Pa, 15213",
>   title =        "Competition-based induction of decision-models from
>                  examples",
>   journal =      "Machine Learning",
>   year =         "1993",
>   volume =       "13",
>   issue =        "2-3",
>   pages =        "229--257",
>   abstract =     "Symbolic induction is a promising approach to
>                  constructing decision models by extracting regularities
>                  from a data set of examples. The predominant type of
>                  model is a classification rule (or set of rules) that
>                  maps a set of relevant environmental features into
>                  specific categories or values. Classifying loan risk
>                  based on borrower profiles, consumer choice from
>                  purchase data, or supply levels based on operating
>                  conditions are all examples of this type of
>                  model-building task. Although current inductive
>                  approaches, such as ID3 and CN2, perform well on
>                  certain problems, their potential is limited by the
>                  incremental nature of their search. Genetic algorithms
>                  (GA) have shown great promise on complex search
>                  domains, and hence suggest a means for overcoming
>                  these limitations. However,
>                  effective use of genetic search in this context
>                  requires a framework that promotes the fundamental
>                  model-building objectives of predictive accuracy and
>                  model simplicity. In this article we describe COGIN, a
>                  GA-based inductive system that exploits the conventions
>                  of induction from examples to provide this framework.
>                  The novelty of COGIN lies in its use of training set
>                  coverage to simultaneously promote competition in
>                  various classification niches within the model and
>                  constrain overall model complexity. Experimental
>                  comparisons with NewID and CN2 provide evidence of the
>                  effectiveness of the COGIN framework and the viability
>                  of the GA approach.",
>   keywords =     "GENETIC ALGORITHMS, SYMBOLIC INDUCTION, CONCEPT
>                  LEARNING",
> }
> 
> @Article{janikow:knowledge-intensive-genetic:93,
>   author =       "C. Z. Janikow",
>   address =      "Umsl, Dept Math \& Comp Sci, St Louis, Mo, 63121",
>   title =        "A knowledge-intensive genetic algorithm for supervised
>                  learning",
>   journal =      "Machine Learning",
>   year =         "1993",
>   volume =       "13",
>   issue =        "2-3",
>   pages =        "189--228",
>   abstract =     "Supervised learning in attribute-based spaces is one
>                  of the most popular machine learning problems studied
>                  and, consequently, has attracted considerable attention
>                  of the genetic algorithm community. The full-memory
>                  approach developed here uses the same high-level
>                  descriptive language that is used in rule-based
>                  systems. This allows for an easy utilization of
>                  inference rules of the well-known inductive learning
>                  methodology, which replace the traditional
>                  domain-independent operators and make the search
>                  task-specific. Moreover, a closer relationship between
>                  the underlying task and the processing mechanisms
>                  provides a setting for an application of more powerful
>                  task-specific heuristics. Initial results obtained with
>                  a prototype implementation for the simplest case of
>                  single concepts indicate that genetic algorithms can be
>                  effectively used to process high-level concepts and
>                  incorporate task-specific knowledge. The method of
>                  abstracting the genetic algorithm to the problem level,
>                  described here for the supervised inductive learning,
>                  can be also extended to other domains and tasks, since
>                  it provides a framework for combining recently popular
>                  genetic algorithm methods with traditional
>                  problem-solving methodologies. Moreover, in this
>                  particular case, it provides a very powerful tool
>                  enabling study of the widely accepted but not so well
>                  understood inductive learning methodology.",
>   keywords =     "GENETIC ALGORITHMS, MACHINE LEARNING, SYMBOLIC
>                  LEARNING, SUPERVISED LEARNING",
> }
> 
> @TechReport{pei.ea:classification-feature:95,
>   author =       "Min Pei and Ying Ding and William F Punch(III) and
>                  Erik D Goodman",
>   title =        "Classification and Feature Extraction of
>                  High-Dimensionality Binary Patterns using a {GA} to
>                  Evolve Rules",
>   institution =  "Michigan State University",
>   year =         "1995",
>   annote =       "Uses std GA to develop classifier system type rules",
> }
> 
> @TechReport{radcliffe.ea:co-operation-throught:94,
>   author =       "N. J. Radcliffe and P. D. Surry",
>   title =        "Co-operation through Hierarchical Competition in
>                  Genetic Data Mining",
>   institution =  "Edinburgh Parallel Computing Centre",
>   type =         "Technical Report",
>   number =       "EPCC-TR94-09",
>   year =         "1994",
> }
> 
> @InProceedings{vafaie.ea:improving-performance:91,
>   author =       "H. Vafaie and K. DeJong",
>   title =        "Improving the performance of a rule induction system
>                  using genetic algorithms",
>   editor =       "R. S. Michalski and G. Tecuci",
>   pages =        "305--315",
>   booktitle =    "Proceedings of the First International Workshop on
>                  Multistrategy Learning MSL-91",
>   year =         "1991",
>   organization = "Center for Artificial Intelligence, Fairfax, VA",
>   address =      "Harpers Ferry, WV",
>   month =        "7-9 " # nov,
> }
> 
> @InProceedings{venturini:sia-supervised:93,
>   author =       "Gilles Venturini",
>   title =        "{SIA}: {A} Supervised Induction Algorithm with Genetic
>                  Search for Learning Attributes based Concepts",
>   booktitle =    "European Conference on Machine Learning (ECML-93)",
>   publisher =    "Springer-Verlag",
>   year =         "1993",
>   keywords =     "GA, Rules, Induction, Comparison",
> }
> 
> 
> @InCollection{forsyth:inductive-learning:89,
>   author =       "Richard Forsyth",
>   title =        "Inductive Learning for Expert Systems",
>   booktitle =    "Expert Systems Principles and Case Studies",
>   publisher =    "Chapman and Hall, New York",
>   year =         "1989",
> }
> 
> 
> @InProceedings{michalski.ea:multi-purpose-incremental:86,
>   author =       "Ryszard S. Michalski and Igor Mozetic and Jiarong Hong
>                  and Nada Lavrac",
>   title =        "The multi-purpose incremental learning system {AQ15}
>                  and its testing application to three medical domains",
>   booktitle =    "Proceedings of the 5th national conference on
>                  Artificial Intelligence",
>   pages =        "1041--1045",
>   address =      "Philadelphia",
>   year =         "1986",
> }
> 
> 
> --
>    Andy Pryke, Research Student, Computer Science, Birmingham University
> Data Mining Information - http://www.cs.bham.ac.uk/~anp/TheDataMine.html

----- End of forwarded message from Min Pei -----


