Master Thesis in Intelligent Web and Information Systems for the year 2010/2011

Research Focus

IWIS: Intelligent Web and Information Systems is a young and dynamic group dealing with aspects of computation on the web. The group's currently active research areas are influenced by World Wide Web research. The World Wide Web has had an extremely significant impact on access to information and information services. Web-based applications are influencing many domains such as business, commerce, banking and learning. The Web has simplified access to information and information services, enabling a variety of users with different backgrounds, social situations, and so on to participate. With the recent movement to Web 2.0 and collaborative co-construction of information, information on the World Wide Web grows more than ever before. The World Wide Web therefore presents a new situation in which traditional methods for information access, information processing, information retrieval, and information presentation hit their limits. The IWIS group studies and contributes methods, tools and techniques for intelligent information systems and web technologies in the above-mentioned context. As master students you will join this young team. More information can be found at iwis.cs.aau.dk

contact: Peter Dolog dolog@cs.aau.dk

Depending on the topic you select, you can also collaborate with one or more of the IWIS group members!

Thesis Project Proposals

Topic of your own

Please feel free to suggest your own topic which you think we can supervise!

Real Time Collaborative Workgroup and Social Systems

Real-time collaboration of distributed teams is important in many areas, such as working on designs, innovation, creativity (enterprises), wikis, games (entertainment), news, event detection (information). Editors and tools for such real-time collaboration need to reflect situations which occur, for example, when people meet face to face (they draw collaboratively, write collaboratively and immediately see what the other team members did). Existing web-based tools for collaboration such as graph editors, (semantic) wikis, blogs, and others need to be extended with these features and with protocols that ensure consistency and other properties. Furthermore, each situation requires a different composition of these tools. RESTful web services and mashups, such as those built with Google Web Toolkit (GWT) or Liferay Portal technology, can be explored for this purpose, as can protocols and algorithms which make such a composition possible.

Service Oriented Architecture and Web Service Middleware

Service oriented computing is getting more attention, especially due to enterprise application integration. The number of company services provided online through web services is growing rapidly, and new horizons for integrating them into advanced computing infrastructures and applications are emerging. Typical examples of currently available services are payment services, credit card handling services, insurance case handling services, and others. The environments which should connect them also bring new challenges. Web services usually make the work of the company and access to the company's services more effective. On the other hand, many things which were previously performed by people in the company should now become part of the environment. For example, recovery from failures, transactions, and coordination between various autonomous participants in web service conversations are just a few areas of interest which are not fully solved in such environments.

In this project you can study:

  • how the transaction concept can be implemented and improved in a web service environment
  • how to design and program service conversations in a more domain oriented way suitable for an average professional in the area
  • service oriented architecture and middleware for rich interactive web applications
  • model driven design for web services
  • formal methods for protocols or methods in web service design
  • machine intelligence techniques for adaptive transactions or middleware
  • and others

Using Social Tagging for Adaptation and Recommendation Purposes

Social tagging is the process in which users bookmark URLs of interest on a public web site and annotate them with a few free-text keywords. With social tagging, a user can explicitly express their own description of web resources such as images, videos, and scientific articles, thus allowing other like-minded users to find similar content on the web. Mining tag information reveals the topic domains of users' interests and contributes significantly to the profile construction process, which in turn supports web personalization or recommendation. Tagging analysis is emerging as a challenging topic in research on intelligent web applications.

Intuitively, tags can be considered key terms with a specific meaning assigned by users. Thus the conventional algorithms and techniques used for information retrieval and text mining can be applied directly to tags, for example using tf-idf weighting to measure the similarity between tags and to compute the relatedness of two documents or articles. However, due to the known limitations of lexical matching in information retrieval, using tags or keywords alone to characterize a web resource is sometimes not sufficient to obtain satisfactory results. Efforts from the perspectives of natural language processing (NLP), machine learning (ML) and data mining (DM) have extended the modeling of tags with additional features, thus enriching the semantics of web resources in social tagging systems. For example, [1] proposed improving the semantic expression of tags by using WordNet in tag-based web recommendation systems. On the other hand, tags are a kind of so-called semantic collaborative information that in fact constitutes an uncontrolled, user-defined vocabulary. It is therefore expected that using machine learning approaches in tag mining will help to enrich the semantics of tags for better modeling of social tagging systems.
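
As a rough illustration of the tf-idf idea mentioned above, the following Python sketch treats the tags of each resource as a small document, weights them with tf-idf, and compares two resources by cosine similarity. The resource names, the tags, and the helper functions tfidf_vectors and cosine are made up for this example; it is a minimal baseline under these assumptions, not the method of [1].

```python
import math
from collections import Counter


def tfidf_vectors(tagged):
    """Map each resource to a {tag: tf-idf weight} dictionary."""
    n = len(tagged)
    # Document frequency: number of resources that carry each tag.
    df = Counter(tag for tags in tagged.values() for tag in set(tags))
    vectors = {}
    for resource, tags in tagged.items():
        tf = Counter(tags)
        vectors[resource] = {
            tag: (count / len(tags)) * math.log(n / df[tag])
            for tag, count in tf.items()
        }
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse tag vectors."""
    dot = sum(weight * v.get(tag, 0.0) for tag, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical toy data: three bookmarked resources and their tags.
tagged = {
    "page_a": ["python", "tutorial", "web"],
    "page_b": ["python", "web", "framework"],
    "page_c": ["cooking", "recipe"],
}
vectors = tfidf_vectors(tagged)
sim_ab = cosine(vectors["page_a"], vectors["page_b"])
sim_ac = cosine(vectors["page_a"], vectors["page_c"])
print(f"page_a vs page_b: {sim_ab:.3f}")
print(f"page_a vs page_c: {sim_ac:.3f}")
```

In this toy data, page_a and page_b share two tags and therefore score higher than page_a and page_c, which share none. The sketch also shows the limitation discussed above: two resources tagged with synonyms only would still get a similarity of zero.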

In this project you can therefore look at a number of options:

  • Clustering: The use of clustering techniques can greatly improve the search performance in Social Bookmarking Systems, if tags with similar sense are grouped in the same cluster. The goal of this research is therefore to investigate the state of the art of clustering techniques and propose a novel approach to improve the search performance in Social Bookmarking Systems.
  • Nearest neighbour classification: Approaches based on nearest neighbours and tags process the data so that it reflects the typical content of a document. Tags and neighbours are used to classify documents so that a search engine returns results with high precision. The highest ranked neighbours influence the meaning of the tag and the content. The main aim of this proposal is to analyse the effect of the nearest tag neighbours in tag-based search.
  • Latent topic analysis: The above challenges can also be addressed via latent topic analysis, as an extension of [1]. In particular, you can use an archived Wikipedia article collection as a reference data source to train a latent topic model, with which each tag is mapped to a learned topic space with a certain probability distribution. With this additional tag feature, a new similarity function between tags will be defined that takes into account the popularity, the representativeness and the enriched semantic factors of tags. Finally, the similarity of web resources is computed based on the defined tag similarity for tag-based web recommendation.
  • Other approaches of machine learning, data mining, similarity computation and so on
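
To make the clustering option more concrete, here is a small Python sketch that groups tags whose sets of annotated resources overlap strongly. It uses Jaccard similarity with a fixed threshold and merges tags in a single-link fashion. The data, the threshold of 0.5, and the function names jaccard and cluster_tags are assumptions chosen for illustration; this is a simple baseline, not one of the clustering methods from the references.

```python
def jaccard(a, b):
    """Jaccard similarity between two sets of resource identifiers."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_tags(tag_resources, threshold=0.5):
    """Group tags whose resource sets overlap enough (single-link merging)."""
    tags = sorted(tag_resources)
    parent = {tag: tag for tag in tags}

    def find(tag):
        # Follow parent links to the cluster representative, compressing the path.
        while parent[tag] != tag:
            parent[tag] = parent[parent[tag]]
            tag = parent[tag]
        return tag

    for i, a in enumerate(tags):
        for b in tags[i + 1:]:
            if jaccard(tag_resources[a], tag_resources[b]) >= threshold:
                parent[find(a)] = find(b)

    clusters = {}
    for tag in tags:
        clusters.setdefault(find(tag), set()).add(tag)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical toy data: each tag maps to the resources it annotates.
tag_resources = {
    "python": {"r1", "r2", "r3"},
    "py": {"r1", "r2"},
    "programming": {"r2", "r3"},
    "recipe": {"r4", "r5"},
    "cooking": {"r4", "r5", "r6"},
}
clusters = cluster_tags(tag_resources)
print(clusters)
```

A search for "py" could then be expanded with the other tags in the same cluster. Note that single-link merging places "py" and "programming" together through "python" even though their direct overlap is below the threshold; whether that is desirable is one of the design questions a thesis could examine.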

The research performed on those approaches is mostly experimental: it is carried out on a specific data set and usually evaluates the precision and recall of the returned and computed results.
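
For reference, precision and recall for a single query can be computed as in the following Python sketch. The result lists are invented for the example and the function name precision_recall is hypothetical; a real evaluation would average these values over many queries on the chosen data set.

```python
def precision_recall(returned, relevant):
    """Return (precision, recall) for one query.

    precision = relevant items among the returned ones / all returned items
    recall    = relevant items among the returned ones / all relevant items
    """
    returned, relevant = set(returned), set(relevant)
    hits = len(returned & relevant)
    precision = hits / len(returned) if returned else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical example: the system returns four resources, three are relevant overall.
returned = ["r1", "r2", "r3", "r4"]
relevant = ["r1", "r3", "r5"]
precision, recall = precision_recall(returned, relevant)
print(f"precision={precision:.2f} recall={recall:.2f}")
```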

References:

  1. Frederico Durao, Peter Dolog, Extending a Hybrid Tag-Based Recommender System with Personalization, in 25th ACM Symposium on Applied Computing, ACM Press, Sierre, Switzerland (2010)
  2. Personalizing Navigation in Folksonomies Using Hierarchical Tag Clustering @ http://www.springerlink.com/content/e27481301647w348
  3. A Neighborhood Search Method for Link-Based Tag Clustering @ http://dx.doi.org/10.1007/978-3-642-03348-3_12
  4. Core-Tag Clustering for Web 2.0 Based on Multi-similarity Measurements @ http://dx.doi.org/10.1007/978-3-642-03996-6_21
  5. A Tag Clustering Method to Deal with Syntactic Variations on Collaborative Social Networks @ http://dx.doi.org/10.1007/978-3-642-02818-2_35
  6. J. Hu et al., Enhancing Text Clustering by Leveraging Wikipedia Semantics (http://doi.acm.org/10.1145/1390334.1390367), in Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (2008), pp 179-184
  7. X. Hu et al., Exploiting Wikipedia as External Knowledge for Document Clustering (http://doi.acm.org/10.1145/1557019.1557066), in Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (2009), pp 389-396
  8. A k-Nearest-Neighbour Method for Classifying Web Search Results with Data in Folksonomies @ http://www.albertauyeung.com/papers/wi08-auyeung.pdf
  9. Neighbour based tag prediction @ http://people.csail.mit.edu/pcm/papers/TagPrediction.pdf

Adaptation on the Web

Due to the large amount of resources on the web and the different devices which can access them, uniform access becomes a disadvantage when targeting diverse users with different preferences, backgrounds, and so on. Adaptation and personalization on the web address these issues. Adaptation and personalization are computational processes which aim to compute the most relevant information for a user or the most suitable way to present it. Various techniques are used for doing so, ranging from simple matchmaking, through clustering and similarity computation, to machine learning based approaches. What they operate on ranges from a selection of different pre-designed options to a resource which is re-rendered on the fly when a particular user accesses it. Adaptation is not only about information but also about the display capabilities of mobile devices.

This project therefore aims at the analysis, design, and implementation of adaptation strategies for a web site or a mobile platform which would improve on the current state of the art in the literature or in technologies.

Recommended readings:

Peter Brusilovsky, Alfred Kobsa, Wolfgang Nejdl: The Adaptive Web: Methods and Strategies of Web Personalization, Springer (2007)