Master Thesis 2011/2012

Master Thesis in Intelligent Web and Information Systems

Research Focus

IWIS (Intelligent Web and Information Systems) is a young and dynamic group dealing with aspects of computation on the Web. The group's currently active research areas are driven by World Wide Web research. The World Wide Web has had a significant impact on access to information and information services. Web-based applications influence many domains, such as business, commerce, banking and learning. The Web has simplified access to information and information services, enabling users with different backgrounds, social situations, and so on to participate. With the recent move to Web 2.0 and the collaborative co-construction of information, information on the World Wide Web is growing faster than ever before. The World Wide Web therefore presents a new situation in which traditional methods for information access, processing, retrieval, and presentation hit their limits. The IWIS group studies and contributes methods, tools and techniques for intelligent information systems and web technologies in this context. As master students you will join this young team. More information can be found at iwis.cs.aau.dk

Contact: Peter Dolog (dolog@cs.aau.dk), Ricardo Lage (ricardol@cs.aau.dk) and Fred Durao (fred@cs.aau.dk)

Thesis Project Proposals

 

Analysis of Social Network Structure to Improve Recommendations to Groups

One of the challenges of making recommendations to groups of individuals is to maximize overall group satisfaction while keeping individual dissatisfaction low. A group recommender system needs to infer the group's preferences and adapt to the evolving preferences of its members. In this proposal, we aim to address this challenge by investigating the social network structure of the recommendation recipients. For instance, a recommender could perform better if news articles are recommended to individuals who already know each other, or if news articles are recommended only to individuals with a high number of social ties. All of these hypotheses require an analysis of the social network structure.
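The trade-off between group satisfaction and individual dissatisfaction can be illustrated with a minimal sketch. It assumes an "average without misery" aggregation, which is only one of several strategies discussed in the group recommendation literature and is not prescribed by this proposal. The function names, the rating scale and the `misery_threshold` parameter are hypothetical.

```python
from statistics import mean


def group_scores(ratings, misery_threshold=2.0):
    """Aggregate per-member predicted ratings into one score per item.

    ratings maps item -> {member: predicted rating}.
    Items that any member rates below misery_threshold are dropped;
    the remaining items are scored by the group average.
    """
    scores = {}
    for item, by_member in ratings.items():
        values = list(by_member.values())
        if min(values) < misery_threshold:
            continue  # at least one member would be too dissatisfied
        scores[item] = mean(values)
    return scores


def recommend(ratings, k=1, misery_threshold=2.0):
    """Return the k items with the highest aggregated group score."""
    scores = group_scores(ratings, misery_threshold)
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical predicted ratings on a 1-5 scale.
ratings = {
    "article_a": {"ann": 5, "bob": 5, "cat": 1},
    "article_b": {"ann": 4, "bob": 3, "cat": 4},
    "article_c": {"ann": 3, "bob": 3, "cat": 3},
}
print(recommend(ratings, k=2))
```

Here `article_a` has the highest plain average but is excluded because one member strongly dislikes it. Social network features, such as the number of ties between members, could be added as weights on the individual ratings.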

Suggested settings for a group recommender system in recent Web 2.0 applications:

           YouTube: recommend music to groups;

           Twitter: recommend news articles to the followers of an account, who can be seen as a group.

References:

  1. L. Boratto and S. Carta. State-of-the-art in group recommendation and new approaches for automatic identification of groups. In A. Soro, E. Vargiu, G. Armano, and G. Paddeu, editors, Information Retrieval and Mining in Distributed Environments, volume 324 of Studies in Computational Intelligence, pages 1–20. Springer Berlin / Heidelberg, 2011.
  2. A. Jameson. More than the sum of its members: challenges for group recommender systems. In AVI ’04: Proceedings of the working conference on Advanced visual interfaces, pages 48–54, New York, NY, USA, 2004. ACM.
  3. A. Crossen, J. Budzik, and K. J. Hammond. Flytrap: intelligent group music recommendation. In IUI ’02: Proceedings of the 7th international conference on Intelligent user interfaces, pages 184–185, New York, NY, USA, 2002. ACM.
  4. J. Masthoff and A. Gatt. In pursuit of satisfaction and the prevention of embarrassment: affective state in group recommender systems. User Modeling and User-Adapted Interaction, 16(3-4):281–319, 2006.

Exploiting Tag Cloud Organization for Intelligent Browsing

A tag cloud is a visual depiction of user-generated tags, or simply the word content of a site, typically used to describe the content of web sites. The tags are usually hyperlinks that lead to a collection of items that are associated with a tag. Tag clouds have been popularized by social sites, such as Flickr, Technorati and del.icio.us.

In general, the tags in a tag cloud are listed alphabetically and rendered in different colors or font sizes according to their popularity, and only a predefined number of them is shown to the user. In this proposal, we aim to study ways of generating high-quality tag clouds that help users navigate a web site. The idea is to propose a novel organization of a tag cloud that does not rely simply on the popularity of tags. A number of techniques can be used, such as similarity measures between tags and clustering methods. A user study comparing the quality of tag clouds is suggested so that the work is evaluated in a real-world scenario.
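One possible way to organize a tag cloud by similarity rather than popularity is sketched below. It assumes that two tags are similar when they annotate largely the same items (Jaccard similarity over item sets) and groups tags with a simple single-link rule. The similarity measure, the `threshold` value and the sample data are illustrative assumptions, not the method the proposal commits to.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_tags(tag_items, threshold=0.5):
    """Group tags whose item sets overlap strongly.

    tag_items maps tag -> set of item ids annotated with that tag.
    Tags are merged (single link) when their Jaccard similarity
    reaches the threshold. Returns a sorted list of sorted clusters.
    """
    parent = {tag: tag for tag in tag_items}

    def find(tag):
        # Union-find lookup with path halving.
        while parent[tag] != tag:
            parent[tag] = parent[parent[tag]]
            tag = parent[tag]
        return tag

    for a, b in combinations(sorted(tag_items), 2):
        if jaccard(tag_items[a], tag_items[b]) >= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for tag in tag_items:
        clusters.setdefault(find(tag), []).append(tag)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical tag assignments: tag -> ids of tagged items.
tag_items = {
    "python": {1, 2, 3},
    "programming": {1, 2, 3, 4},
    "recipe": {7, 8},
    "cooking": {7, 8, 9},
}
print(cluster_tags(tag_items))
```

Each resulting cluster could then be rendered as a visually separate region of the tag cloud, with font size still reflecting popularity inside the cluster.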

References:

  1. J. Schrammel, M. Leitner, and M. Tscheligi. Semantically structured tag clouds: an empirical evaluation of clustered presentation approaches. In Proceedings of the 27th international conference on Human factors in computing systems, CHI ’09, pages 2037–2040, New York, NY, USA, 2009. ACM.
  2. P. Venetis, G. Koutrika, and H. Garcia-Molina. On the selection of tags for tag clouds. In Proceedings of the fourth ACM international conference on Web search and data mining, WSDM ’11, pages 835–844, New York, NY, USA, 2011. ACM.
  3. B. Y.-L. Kuo, T. Hentrich, B. M. Good, and M. D. Wilkinson. Tag clouds for summarizing web search results. In Proceedings of the 16th international conference on World Wide Web, WWW ’07, pages 1203–1204, New York, NY, USA, 2007. ACM.


Identifying Similarities, Periodicities and Bursts for Online Events

Manually monitoring events in a time series is expensive and difficult for humans. Techniques such as the Discrete Fourier Transform (DFT) and Markov models can be applied to model information flows and to detect aperiodic bursts in time series. In practical terms, such methods could be applied to detect anomalies in temporal events such as seasonal epidemics or sales variations in marketplaces. The evaluation can be carried out on data sets available on the Web that log events in any domain.
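As a rough illustration of how periodicities and bursts might be found, the sketch below computes a naive DFT to pick the strongest period of an event-count series, and flags bursts with a simple z-score rule. The z-score rule is a stand-in chosen for brevity, not one of the burst detection methods from the references, and the function names, threshold and synthetic data are assumptions.

```python
import cmath
import math
from statistics import mean, pstdev


def dominant_period(series):
    """Return the period (in samples) of the strongest DFT component."""
    n = len(series)
    avg = mean(series)
    centered = [x - avg for x in series]  # remove the constant component
    best_k, best_power = 0, 0.0
    for k in range(1, n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(centered)
        )
        power = abs(coeff) ** 2
        if power > best_power:
            best_k, best_power = k, power
    return n / best_k if best_k else None


def bursts(series, z=3.0):
    """Return the indices whose value lies more than z std devs above the mean."""
    avg, sd = mean(series), pstdev(series)
    if sd == 0:
        return []
    return [t for t, x in enumerate(series) if (x - avg) / sd > z]


# Synthetic daily event counts with a weekly rhythm (four weeks).
weekly = [10 + 5 * math.sin(2 * math.pi * t / 7) for t in range(28)]
print(dominant_period(weekly))

# The same series with one artificial spike on day 17.
spiked = list(weekly)
spiked[17] += 40
print(bursts(spiked))
```

The naive DFT costs O(n^2) operations, which is acceptable for a short series; a longer log would call for an FFT implementation.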

References:

  1. Q. He, K. Chang, and E.-P. Lim. Analyzing feature trajectories for event detection. In SIGIR, 2007.
  2. G. P. C. Fung, J. X. Yu, P. S. Yu, and H. Lu. Parameter free bursty events detection in text streams. In VLDB, pages 181–192, 2005.
  3. Q. He, K. Chang, E.-P. Lim, and J. Zhang. Bursty feature representation for clustering text streams. In SDM, accepted, 2007.

Using Social Tagging for Adaptation and Recommendation Purposes

Social tagging is the process by which users bookmark URLs of interest on a public web site and annotate them with a few free-text keywords. With social tagging, a user can explicitly express their own description of web resources such as images, videos and scientific articles, which helps other like-minded users find similar content on the web. Mining tag information reveals the topic domains of users’ interests and contributes significantly to profile construction, which in turn supports web personalization and recommendation. Tagging analysis is emerging as a challenging topic in research on intelligent web applications.

Intuitively, tags can be considered key terms to which users attach a specific meaning. Conventional algorithms and techniques from information retrieval and text mining can therefore be applied directly to tags, for example using tf-idf weighting to measure the similarity between tags and to compute the relatedness of two documents or articles. However, due to the limits of lexical expression known from information retrieval, tags or keywords alone are sometimes not sufficient to characterize a web resource and obtain satisfactory results. Efforts from natural language processing (NLP), machine learning (ML) and data mining (DM) have extended the modeling of tags with additional features, thereby enriching the semantics of web resources in social tagging systems. For example, [1] proposed improving the semantic expression of tags by using WordNet in tag-based web recommendation systems. On the other hand, tags are a kind of so-called semantic collaborative information that constitutes an uncontrolled, user-defined vocabulary. Machine learning approaches to tag mining are therefore expected to help enrich the semantics of tags and to model social tagging systems better.
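The tf-idf idea mentioned above can be sketched as follows, treating the tags of each resource as a small document and comparing resources by cosine similarity. This is a minimal sketch under assumed definitions (relative term frequency, logarithmic inverse document frequency); the function names and sample data are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build a tf-idf vector per resource.

    docs maps resource -> list of tags (repeats allowed).
    Returns resource -> {tag: weight}.
    """
    n = len(docs)
    # Document frequency: in how many resources does each tag occur?
    df = Counter(tag for tags in docs.values() for tag in set(tags))
    vectors = {}
    for name, tags in docs.items():
        tf = Counter(tags)
        vectors[name] = {
            tag: (count / len(tags)) * math.log(n / df[tag])
            for tag, count in tf.items()
        }
    return vectors


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(tag, 0.0) for tag, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Hypothetical tagged resources.
docs = {
    "page_a": ["python", "tutorial", "web"],
    "page_b": ["python", "web", "framework"],
    "page_c": ["recipe", "pasta"],
}
vectors = tfidf_vectors(docs)
print(cosine(vectors["page_a"], vectors["page_b"]))
print(cosine(vectors["page_a"], vectors["page_c"]))
```

Resources that share no tags get a similarity of zero even when their tags are synonyms, which is the limitation of purely lexical matching that the semantic enrichment is meant to address.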

In this project you can therefore look at a number of options:

  • Clustering: Clustering techniques can greatly improve search performance in social bookmarking systems if tags with a similar sense are grouped in the same cluster. The goal of this research is therefore to investigate the state of the art in clustering techniques and to propose a novel approach that improves search performance in social bookmarking systems.
  • Nearest neighbour classification: Approaches based on nearest neighbours and tags process the data so that it reflects the typical content of a document. Tags and neighbours are used to classify documents so that a search engine can return results with high precision. The highest-ranked neighbours influence the meaning of the tags and of the content. The main aim of this proposal is to analyse the effect of the nearest tag neighbours on tag-based search.
  • Latent topic analysis: The above challenges can also be addressed with latent topic analysis, as an extension of [1]. In particular, you can use an archived Wikipedia article collection as a reference data source to train a latent topic model, with which each tag is mapped to a learned topic space with a certain probability distribution. With this additional feature, a new similarity function between tags can be defined that combines the popularity, the representativeness and the enriched semantics of tags. Finally, the similarity of web resources is computed from the defined tag similarity for tag-based web recommendation.
  • Other approaches of machine learning, data mining, similarity computation and so on
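As one illustration of the options above, nearest neighbour classification over tags could look like the sketch below. It assumes that resources are compared by the Jaccard similarity of their tag sets and that the label is chosen by majority vote among the `k` most similar labelled resources. The names, the similarity measure and the data are assumptions made for the example.

```python
from collections import Counter


def jaccard(a, b):
    """Jaccard similarity of two tag sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def knn_label(query_tags, labelled, k=3):
    """Predict a label for query_tags by majority vote of the k nearest neighbours.

    labelled is a list of (tag_set, label) pairs.
    """
    ranked = sorted(
        labelled,
        key=lambda pair: jaccard(query_tags, pair[0]),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical labelled resources: (tags, category).
labelled = [
    ({"python", "code"}, "programming"),
    ({"java", "code"}, "programming"),
    ({"pasta", "recipe"}, "cooking"),
    ({"soup", "recipe"}, "cooking"),
]
print(knn_label({"code", "python", "tutorial"}, labelled, k=3))
```

The same ranking step could be reused to suggest tags for an untagged resource by collecting the tags of its nearest neighbours instead of their labels.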

Research on these approaches is mostly experimental: it is carried out on a specific data set and usually evaluates the precision and recall of the returned and computed results.
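For reference, precision and recall for a single query can be computed as in the following minimal sketch. The function name and the sample result lists are hypothetical.

```python
def precision_recall(returned, relevant):
    """Compute precision and recall for one query.

    returned: items produced by the system.
    relevant: items judged relevant in the ground truth.
    """
    returned, relevant = set(returned), set(relevant)
    hits = len(returned & relevant)
    precision = hits / len(returned) if returned else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Four returned items, two of which are among the three relevant ones.
precision, recall = precision_recall(["a", "b", "c", "d"], ["a", "c", "e"])
print(precision, recall)
```

Precision is the fraction of returned items that are relevant, and recall is the fraction of relevant items that were returned.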

References:

  1. Frederico Durao, Peter Dolog, Extending a Hybrid Tag-Based Recommender System with Personalization, in 25th ACM Symposium on Applied Computing, ACM Press, Sierre, Switzerland (2010)
  2. Personalizing Navigation in Folksonomies Using Hierarchical Tag Clustering @ http://www.springerlink.com/content/e27481301647w348
  3. A Neighborhood Search Method for Link-Based Tag Clustering @ http://dx.doi.org/10.1007/978-3-642-03348-3_12
  4. Core-Tag Clustering for Web 2.0 Based on Multi-similarity Measurements @ http://dx.doi.org/10.1007/978-3-642-03996-6_21
  5. A Tag Clustering Method to Deal with Syntactic Variations on Collaborative Social Networks @ http://dx.doi.org/10.1007/978-3-642-02818-2_35
  6. J. Hu et al, Enhancing Text Clustering by Leveraging Wikipedia Semantics (http://doi.acm.org/10.1145/1390334.1390367), in Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval (2008), pp. 179-184
  7. X. Hu et al, Exploiting Wikipedia as External Knowledge for Document Clustering (http://doi.acm.org/10.1145/1557019.1557066), in Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining (2009), pp. 389-396
  8. A K-nearest neighbour method for classifying web search results with data in Folksonomies: http://www.albertauyeung.com/papers/wi08-auyeung.pdf
  9. Neighbour based tag prediction: http://people.csail.mit.edu/pcm/papers/TagPrediction.pdf