Web Document Modeling

Publication Author: 
Alessandro Micarelli
Filippo Sciarrone
Mauro Marinilli
Host Title: 
The Adaptive Web
Series: 
Lecture Notes in Computer Science
Volume: 
4321
Number: 
978

A very common issue in adaptive Web-based systems is the modeling of documents. Such documents represent domain-specific information for a number of purposes. Application areas such as Information Search, Focused Crawling and Content Adaptation (among many others) benefit from several techniques and approaches to model documents effectively. For example, a document usually needs preliminary processing to extract the relevant information in a format that the system can process automatically. The objective of this chapter is to support other chapters by providing a basic overview of the most common and useful techniques and approaches related to document modeling. This chapter describes high-level techniques to model Web documents, such as the Vector Space Model, and a number of AI approaches, such as Semantic Networks, Neural Networks and Bayesian Networks. This chapter is not meant to act as a substitute for more comprehensive discussions of the topics presented. Rather, it provides a brief and informal introduction to the main concepts of document modeling, also focusing on the systems that are presented in the rest of the book as concrete examples of the related concepts.

Month: 
May
Event Location: 
The Adaptive Web
Year: 
2007
Type of paper: 
Article in Proceedings (WS or C)
Publisher Address: 
Berlin, Germany
Publisher: 
The Adaptive Web