DM: Web Data Mining

From: #DEVANSHU DHYANI#
Date: Mon, 30 Nov 1998 11:06:43 -0500 (EST)

I am involved in research aimed at extending data mining to
semi-structured data such as WWW documents. In applying standard KDD
operators (discovery of association and characteristic rules,
classification, and clustering) to generalization-based mining on an
information base derived from WWW documents, we face the following
questions:

1. The information base itself must support storage, retrieval, and
querying operations on semi-structured data. In our search for an
appropriate data model we have come across XML/DOM (XML being
anticipated as the heir to HTML as markup for web documents) and the
somewhat similar Lore DBMS (by the Stanford Database Group). Are there
any other data models suitable for mining-related tasks on
semi-structured data?

2. Generalization-based data mining presupposes the availability of
domain-specific background knowledge. Although concept hierarchies
fulfil this requirement by aiding attribute-oriented induction and
concept tree ascension (in relational databases), their disadvantage
lies in the need to generate them manually for each domain. We are
exploring the possibility of adapting pre-existing, shared, reusable
ontologies (such as those in the Ontolingua system under the DARPA
Knowledge Sharing Effort) for this purpose. Would the use of available
ontologies improve the generalization process, especially since these
may cover greater domain knowledge (in both depth and extent of
concepts) than indigenous concept hierarchies?
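
To make question 1 more concrete, here is a minimal sketch of what
"querying semi-structured data" might look like through a DOM
interface. It uses Python's standard xml.dom.minidom module. The
document, the element names (site, page, keyword), and the helper
functions are all hypothetical, chosen only to show records that do not
share a fixed set of fields. It is not a proposal for the information
base itself.

```python
from xml.dom.minidom import parseString

# Hypothetical document: the two <page> records carry different fields,
# which is what makes the data semi-structured rather than relational.
DOC = """<site>
  <page url="a.html"><title>Data Mining</title><author>Lee</author></page>
  <page url="b.html"><title>Ontologies</title><keyword>KDD</keyword><keyword>XML</keyword></page>
</site>"""


def field_values(page, tag):
    """Return the text of every descendant element named `tag` (possibly none)."""
    return [
        "".join(n.data for n in el.childNodes if n.nodeType == n.TEXT_NODE)
        for el in page.getElementsByTagName(tag)
    ]


def load_pages(xml_text):
    """Parse the document and index its <page> elements by their url attribute."""
    dom = parseString(xml_text)
    return {p.getAttribute("url"): p for p in dom.getElementsByTagName("page")}


pages = load_pages(DOC)
# A missing field yields an empty list instead of an error or a NULL column.
keywords_by_url = {url: field_values(p, "keyword") for url, p in pages.items()}
```

The point of the sketch is only that a query over such data has to
tolerate absent and repeated fields, which a fixed relational schema
does not express directly.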

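For question 2, the following is a simplified sketch of concept tree
ascension on a single attribute, assuming the concept hierarchy is
given as a child-to-parent mapping. The hierarchy contents, the
function names, and the distinct-value threshold are illustrative
assumptions. Full attribute-oriented induction also handles several
attributes at once and merges identical generalized tuples, which this
sketch leaves out.

```python
from collections import Counter

# Hypothetical concept hierarchy: each concept maps to its parent concept.
PARENT = {
    "Singapore": "Asia", "Japan": "Asia",
    "France": "Europe", "Germany": "Europe",
    "Asia": "ANY", "Europe": "ANY",
}


def ascend(value, hierarchy):
    """Replace a value by its parent concept; values with no parent stay as they are."""
    return hierarchy.get(value, value)


def generalize(values, hierarchy, threshold):
    """Climb the hierarchy one level at a time until at most `threshold`
    distinct values remain, then count the generalized values."""
    current = list(values)
    while len(set(current)) > threshold:
        lifted = [ascend(v, hierarchy) for v in current]
        if lifted == current:  # nothing can be generalized any further
            break
        current = lifted
    return Counter(current)


counts = generalize(
    ["Singapore", "Japan", "France", "Germany", "Japan"], PARENT, threshold=2
)
```

Under this reading, a shared ontology would take the place of the
hand-built PARENT mapping, provided its is-a links can be extracted in
this child-to-parent form.
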
   Thanks for your help.

Devanshu Dhyani
Undergraduate student,
Centre for Advanced Information Systems (CAIS),
Nanyang Technological University,
Singapore.
