Automatic Ontology Construction

LU Qin
Dept. of Computing
The Hong Kong Polytechnic University
Tel. 2766 7247, Fax. 2774 0842
email: csluqin@comp.polyu.edu.hk

Abstract:

An ontology representing a domain-specific knowledge space is constructed from domain-specific terms. The concepts behind these terms are described by their attributes and by the relations among the concepts. This talk will first give an overview of the various ontology construction methods. Automatic ontology construction methods can be classified into top-down approaches and bottom-up approaches. In the top-down approach, an upper-level ontology is given, and algorithms are developed to expand it from the most general concepts downwards until they reach the leaf nodes, where instances of concepts can be attached. By contrast, in the bottom-up approach, a domain corpus is used to extract concepts, attributes, and relations without prior ontological knowledge. To generate a comprehensive ontology, the domain corpus must have good coverage of the domain knowledge. It is therefore vitally important to have a high-quality corpus available for information extraction. This talk will focus on how to exploit Wikipedia as a semantically rich corpus for ontology construction. The exploration includes how to identify domain-specific information, how to identify concept terms and their attributes, and finally how to identify the taxonomic relationships among the identified terms.
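
As a rough illustration of the last step (identifying taxonomic relationships within one domain), the sketch below walks a category hierarchy from a chosen domain root and collects parent/child pairs. It is not the speaker's method: the function name domain_taxonomy, the toy_links data, and the simplifying assumption that every subcategory link can be read as a taxonomic (is-a) relation are all hypothetical and used only for illustration. Real Wikipedia category links are noisier than this and would need filtering.

```python
from collections import deque


def domain_taxonomy(subcategories, root):
    """Collect (parent, child) edges reachable from `root`, breadth-first.

    `subcategories` maps a category name to a list of its direct
    subcategories. Each link is treated as a taxonomic relation, which is
    a simplifying assumption made for this sketch.
    """
    edges = []
    seen = {root}          # categories already queued, guards against cycles
    queue = deque([root])
    while queue:
        parent = queue.popleft()
        for child in subcategories.get(parent, []):
            edges.append((parent, child))
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return edges


# Hypothetical toy data standing in for category links from a corpus.
toy_links = {
    "Computer science": ["Artificial intelligence", "Databases"],
    "Artificial intelligence": ["Natural language processing"],
    "Natural language processing": ["Information extraction"],
    "Music": ["Jazz"],  # outside the chosen domain, so never visited
}

edges = domain_taxonomy(toy_links, "Computer science")
for parent, child in edges:
    print(f"{child} is-a {parent}")
```

Restricting the walk to categories reachable from the domain root is one simple way to keep only domain-specific information; categories such as "Music" in the toy data are never visited.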

About the Speaker:

Prof. Lu's research focuses mainly on applying natural language processing methods to information extraction and text mining. She has conducted extensive work on Chinese collocation extraction, terminology extraction, and ontology construction. Her research has received over 2 million in funding from the CERG and over 10 million in funding from the ITF. Prof. Lu received her B.S. in E.E. from Beijing Normal University, and her M.S. and Ph.D. in computer science from the University of Illinois at Urbana-Champaign. She is currently the chief editor of IJCPOL.