DM: Web Data Mining
From: #DEVANSHU DHYANI#
Date: Mon, 30 Nov 1998 11:06:43 -0500 (EST)

I am involved in research aimed at extending the utility of data mining to semi-structured data such as WWW documents. In applying standard KDD operators (discovering association and characteristic rules, classification, clustering) to generalization-based mining on an information base derived from WWW documents, we are faced with the following questions:

1. The information base itself must support storage, retrieval and querying operations on semi-structured data. In our search for an appropriate data model we have come across XML/DOM (anticipated as the heir to HTML as markup for web documents) and the somewhat similar Lore DBMS (by the Stanford Database Group). Are there any other data models that are suitable for mining-related tasks on semi-structured data?

2. Generalization-based data mining presupposes the availability of domain-specific background knowledge. Although concept hierarchies fulfil this requirement by aiding attribute-oriented induction and concept tree ascension (in relational databases), their disadvantage is that they must be generated manually for each domain. We are exploring the possibility of adapting pre-existing, shared, reusable ontologies (such as those in the Ontolingua system under the DARPA Knowledge Sharing Effort) for this purpose. Would the use of available ontologies improve the generalization process, especially since these may cover greater domain knowledge (in both depth and extent of concepts) than indigenous concept hierarchies?

Thanks for your help.

Devanshu Dhyani
Undergraduate student,
Centre for Advanced Information Systems (CAIS),
Nanyang Technological University, Singapore.
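
To make question 2 more concrete, here is a minimal sketch of concept tree ascension as used in attribute-oriented induction. It is only an illustration under stated assumptions: the hierarchy in PARENT, the sample rows, and the function names ascend and generalize are all hypothetical and are not taken from Ontolingua, Lore, or any existing system.

```python
from collections import Counter

# Hypothetical, hand-built concept hierarchy: each value maps to its parent.
# An ontology-derived hierarchy would replace this table.
PARENT = {
    "Singapore": "Asia",
    "Tokyo": "Asia",
    "Paris": "Europe",
    "Berlin": "Europe",
    "Asia": "ANY",
    "Europe": "ANY",
}


def ascend(value, levels=1):
    """Climb `levels` steps up the hierarchy; values without a parent stay put."""
    for _ in range(levels):
        value = PARENT.get(value, value)
    return value


def generalize(rows, attr_index, levels=1):
    """Replace one attribute by its ancestor and merge identical rows,
    keeping a count of how many original rows each generalized row covers."""
    counts = Counter()
    for row in rows:
        row = list(row)
        row[attr_index] = ascend(row[attr_index], levels)
        counts[tuple(row)] += 1
    return dict(counts)


# Hypothetical sample relation: (role, city).
rows = [
    ("student", "Singapore"),
    ("student", "Tokyo"),
    ("staff", "Paris"),
    ("student", "Berlin"),
]

generalized = generalize(rows, attr_index=1)
print(generalized)
```

Swapping in a deeper or broader hierarchy changes only the PARENT table, which is where a reusable ontology could in principle take the place of a manually generated concept hierarchy.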