
Re: DM: Web Data Mining


From: David L Dowe
Date: Mon, 30 Nov 1998 19:45:12 -0500 (EST)
   Dear Devanshu,

      The papers below might be of interest.

David.

Dr. David Dowe, Senior Lecturer,
School of Computer Science and Software Eng.,
Monash University, Clayton, Victoria 3168, Australia
dld@cs.monash.edu.au
Tel:+61 3 9905-5776  Fax:+61 3 9905-5146   
http://www.csse.monash.edu.au/~dld/


D L Dowe, L Allison and G Pringle (1998).  The hunter and the hunted -
      modelling the relationship between Web pages and search engines,
      pp380-382, 2nd Pacific-Asia Conference on Knowledge Discovery and
      Data Mining (PAKDD98), Lecture Notes in Artificial Intelligence
      (LNAI) 1394, Melbourne, Australia, April 1998

G Pringle, L Allison and D L Dowe (1998).  What is a tall poppy among
      Web pages?, in H Ashman and P Thistlewaite (eds.), Computer
      Networks and ISDN Systems (Journal): Proceedings of the 7th
      International World Wide Web Conference, Brisbane, Australia,
      14-18 April 1998, Elsevier Science BV, Netherlands,
      ISBN: 0169-7552-98, Vol. 30: 1-7: pp369-377

> From owner-datamine-l@nessie.crosslink.net  Tue Dec  1 03:36:18 1998
> From: #DEVANSHU DHYANI# <SA1377897@ntu.edu.sg>
> To: "'datamine-l@nautilus-sys.com'" <datamine-l@nautilus-sys.com>
> Subject: DM: Web Data Mining
> Date: Mon, 30 Nov 1998 23:37:01 +0800
> 
> I am involved in research aimed at extending the utility of data
> mining to semi-structured data such as WWW documents. In applying
> standard KDD operators, from discovering association and
> characteristic rules, classification and clustering to
> generalization-based mining, on an information base derived from WWW
> documents, we are faced with the following questions:
> 
> 1. The information base itself must support storage, retrieval and
> querying operations on semi-structured data. In our search for an
> appropriate data model we have come across XML/DOM (anticipated as
> the heir to HTML as markup for web documents) and the somewhat
> similar Lore DBMS (by the Stanford Database group). Are there any
> other data models that are appropriate for mining-related tasks on
> semi-structured data?
> 
> 2. Generalization-based data mining presupposes the availability of
> domain-specific background knowledge. Although concept hierarchies
> fulfil this requirement by aiding attribute-oriented induction and
> concept tree ascension (in relational databases), their disadvantage
> lies in the need to generate them manually for each domain. We are
> exploring the possibility of adapting pre-existing, shared, reusable
> ontologies (such as those in the Ontolingua system under the DARPA
> Knowledge Sharing Effort) for this purpose. Would the use of
> available ontologies improve the generalization process, especially
> because these may cover greater domain knowledge (both in depth and
> extent of concepts) than indigenous concept hierarchies?
> 
>    Thanks for your help.
> 
> Devanshu Dhyani
> Undergraduate student,
> Centre for Advanced Information Systems (CAIS),
> Nanyang Technological University,
> Singapore.


