An Enhanced Web Document Search Engine using a Semantic Network

Sang Thanh Thi Nguyen, Tuan Thanh Nguyen

Abstract


With the rapid advancement of information and communication technology (ICT), the World Wide Web (hereafter, the Web) has become the largest information repository, and its volume grows daily. The challenge is to find the most relevant information on the Web with minimum effort. This paper presents a novel ontology-based framework for finding web pages related to a given term within a small set of specified websites. In this framework, a web crawler first gathers the content of the web pages within the given websites; a topic modeller then uses Latent Dirichlet Allocation (LDA) to relate web pages to topics via the keywords found on those pages. Next, an ontology builder constructs an ontology, a semantic network of web pages, from the topic model. Finally, a reasoner uses the ontology to find the web pages related to a given term. The framework and its modelling techniques have been verified on a few test websites, and the results indicate its superiority over existing web search tools.
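The last two stages of the pipeline (ontology builder and reasoner) can be illustrated with a minimal sketch. It is not the authors' implementation: the topic-model output is hard-coded rather than produced by LDA, the URLs, topic labels and weights are invented, and the names `build_network` and `related_pages` are hypothetical. The sketch assumes the semantic network can be approximated by weighted keyword-to-topic and topic-to-page links, and that a page's relevance to a term is the sum of the products of those weights.

```python
from collections import defaultdict

# Assumed topic-model output (invented values, standing in for LDA results):
# topic -> {keyword: weight} and page URL -> {topic: weight}.
TOPIC_WORDS = {
    "t0": {"admission": 0.6, "tuition": 0.4},
    "t1": {"research": 0.7, "laboratory": 0.3},
}
PAGE_TOPICS = {
    "https://example.edu/apply": {"t0": 0.9, "t1": 0.1},
    "https://example.edu/labs": {"t0": 0.2, "t1": 0.8},
    "https://example.edu/fees": {"t0": 0.7},
}


def build_network(topic_words, page_topics):
    """Build weighted links keyword -> topic and topic -> page."""
    word_to_topics = defaultdict(dict)
    for topic, words in topic_words.items():
        for word, weight in words.items():
            word_to_topics[word][topic] = weight

    topic_to_pages = defaultdict(dict)
    for page, topics in page_topics.items():
        for topic, weight in topics.items():
            topic_to_pages[topic][page] = weight

    return word_to_topics, topic_to_pages


def related_pages(term, network, top_n=5):
    """Rank pages reachable from `term` through any shared topic."""
    word_to_topics, topic_to_pages = network
    scores = defaultdict(float)
    for topic, word_weight in word_to_topics.get(term.lower(), {}).items():
        for page, page_weight in topic_to_pages[topic].items():
            scores[page] += word_weight * page_weight
    # Highest combined weight first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)[:top_n]


network = build_network(TOPIC_WORDS, PAGE_TOPICS)
ranking = related_pages("tuition", network)
for page, score in ranking:
    print(f"{score:.2f}  {page}")
```

With these invented weights, a query for "tuition" reaches all three pages through topic `t0` and ranks them by how strongly each page belongs to that topic; a term absent from every topic returns an empty list.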


DOI: http://dx.doi.org/10.21553/rev-jec.134

Copyright (c) 2016 REV Journal on Electronics and Communications

