Summary of readings

Submitted by K.Reddy on Mon, 03/29/2010 - 13:03.

Type: Task
Presentation status: progress

Paper title	Authors	Summary	Link
Kernel Nearest-Neighbor Algorithm	Kai Yu, Liang Ji and Xuegong Zhang	The ‘kernel approach’ is applied to replace the norm distance metric with one computed in a Hilbert feature space, which turns the nearest-neighbor (NN) algorithm into a kernel nearest-neighbor algorithm. Under specific conditions, such as a polynomial kernel with p=1 or a radial basis kernel, it reduces to the conventional nearest-neighbor algorithm. With an appropriately chosen kernel function, the kernel nearest-neighbor algorithm gives better results than the conventional one.	http://www.springerlink.com/content/hqg0keryj8tuftyg/
Scalable Collaborative Filtering with Jointly Derived Neighborhood Interpolation Weights	Robert M. Bell and Yehuda Koren	Neighborhood-based collaborative filtering relies on interpolation weights, which are used to estimate unknown ratings from neighboring known ones, yet the literature lacks a rigorous way to derive these weights. This work shows how the interpolation weights can be computed as a global solution to an optimization problem that precisely reflects their role. Comparison with past kNN methods on the Netflix data demonstrated a significant improvement in prediction accuracy without a meaningful increase in running time. A kNN method is most effectively employed in an item-oriented manner, by analyzing relationships between items.	http://portal.acm.org/citation.cfm?id=1442050
Automated Tag Clustering: Improving search and exploration in the tag space	Grigory Begelman, Philipp Keller, Frank Smadja	The work presents convincing evidence that clustering techniques can and should be used in combination with tagging. Clustering can improve the tagging experience and the use of the tag space in general. The authors present several clustering techniques and provide some results obtained on del.icio.us.	http://www.pui.ch/phred/automated_tag_clustering/
Autotagging to Improve Text Search for 3D Models	Corey Goldfeder, Peter Allen	Demonstrates an automatic tagging system that learns new tags for a 3D model by comparing it to a large set of tagged models and probabilistically propagating tags from neighbors. The authors show that the discriminative power of these tags is comparable to that of the underlying geometric similarity distance, and that searching for models based on the autotags can result in better precision and greater recall than searching on the original tags. Beyond improving shape retrieval in a digital library, there are several other domains where automatically annotating 3D models can be helpful.	http://portal.acm.org/citation.cfm?id=1378889.1378950
Harvesting Social Knowledge from Folksonomies	Harris Wu, Mohammad Zubair, Kurt Maly	Collaborative tagging systems have the potential to become a technological infrastructure for harvesting social knowledge, but many challenges remain. The authors designed prototypes that enhance social tagging systems to meet some of the key challenges, and developed a comprehensive evaluation methodology.	http://portal.acm.org/citation.cfm?id=1149941.1149962
Classification-Enhanced Ranking	Paul N. Bennett, Krysta Svore, Susan T. Dumais	Demonstrates that topical class information can be used to improve the relevance of retrieval results by generalizing along the class dimension to identify other relevant results. To do this, the authors introduce a natural definition of query class that stems from the results relevant to that query and can be estimated using click behavior. The approach is notable for its focus on directly improving ranking relevance rather than indirect measures like query classification.	http://portal.acm.org/citation.cfm?id=1772703&dl=ACM
Exploiting Query Reformulations for Web Search Result Diversification	Rodrygo L. T. Santos, Craig Macdonald, Iadh Ounis	Introduces a novel probabilistic framework for search result diversification. The xQuAD (eXplicit Query Aspect Diversification) framework explicitly models the aspects underlying an initial query in the form of sub-queries. Instead of comparing documents to one another, which usually demands expensive computations, the approach achieves effective diversification by directly estimating the relevance of the retrieved documents to multiple sub-queries. Besides being efficient in practice, the principled formulation of xQuAD naturally models several dimensions of interest in a diversification task as components within the framework. These include the relevance of a document to the initial query and to its multiple aspects (identified as sub-queries), the relative importance of each sub-query, and how novel a document satisfying each sub-query is.	http://portal.acm.org/citation.cfm?id=1772690.1772780
Personalized Query Expansion for the Web	Paul-Alexandru Chirita, Claudiu S. Firan, Wolfgang Nejdl	Proposes expanding Web search queries by exploiting the user’s Personal Information Repository to automatically extract additional keywords related both to the query itself and to the user’s interests, thereby personalizing the search output.	http://portal.acm.org/citation.cfm?id=1277741.1277746
Accurate Methods for the Statistics of Surprise and Coincidence	Ted Dunning	Log-likelihood ratio (full text: http://acl.ldc.upenn.edu/J/J93/J93-1003.pdf)	http://portal.acm.org/citation.cfm?id=972454
Finding Relevant Concepts for Unknown Terms Using a Web-based Approach	Chen-Ming Hung and Lee-Feng Chien	Presents an approach to finding relevant concepts for terms by utilizing the World Wide Web. The approach obtained encouraging experimental results when tested on Yahoo!’s computer science hierarchy, but the authors note that the work needs more in-depth study. Choosing the word with the highest weighted log likelihood ratio as the concept of a clustered group after the Greedy EM algorithm does not yield a sufficiently representative concept. In addition, one concept usually spans many domains; e.g. “ATM” covers security, teller machine, transaction cost, etc. Thus, assigning the extracted keywords to a particular concept still needs human intervention. Finally, to address the “too much effort” problem of the Greedy EM algorithm, it needs to be modified with another convergence criterion.	http://www.aclweb.org/anthology/O/O04/O04-1007.pdf
Improving Term Extraction Using Particle Swarm Optimization Techniques	Mohammad Syafrullah and Naomie Salim	Presents a particle swarm optimization technique to improve term extraction precision. The authors choose five features to represent the term score: domain relevance, domain consensus, term cohesion, first occurrence and length of noun phrase. In the experiments, a translation of the meaning of the Quran (focusing on verses of prayer) serves as the input document, split into training documents and test documents. Particle swarm optimization is trained on the training documents to determine the weight of each feature that produces the best score for each term. The test documents are then scored using the feature weights from the training stage to calculate the final score for each term to be extracted.	http://www.scipub.org/fulltext/jcs/jcs63323-329.pdf
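
As an illustration of the first entry in the table, the kernel nearest-neighbor idea can be sketched as follows. This is a minimal sketch, not the authors' implementation: it assumes the usual feature-space distance d²(x, y) = K(x, x) - 2K(x, y) + K(y, y), a polynomial kernel of the form (1 + x·y)^p, and a radial basis kernel with a width parameter sigma. All function names are hypothetical and chosen for this example only.

```python
import math

def polynomial_kernel(x, y, p=1):
    # Assumed form (1 + <x, y>)^p; the paper's exact parameterization may differ.
    return (1.0 + sum(a * b for a, b in zip(x, y))) ** p

def rbf_kernel(x, y, sigma=1.0):
    # Assumed Gaussian form exp(-||x - y||^2 / (2 * sigma^2)).
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq / (2.0 * sigma ** 2))

def kernel_distance_sq(x, y, kernel):
    # Squared distance between the images of x and y in the kernel's feature space.
    return kernel(x, x) - 2.0 * kernel(x, y) + kernel(y, y)

def kernel_nn_classify(query, samples, labels, kernel):
    # Return the label of the training sample closest to the query in feature space.
    best = min(range(len(samples)),
               key=lambda i: kernel_distance_sq(query, samples[i], kernel))
    return labels[best]

# Hypothetical toy data, for illustration only.
samples = [(0.0, 0.0), (5.0, 5.0)]
labels = ["a", "b"]
label_poly = kernel_nn_classify((1.0, 1.0), samples, labels, polynomial_kernel)
label_rbf = kernel_nn_classify((1.0, 1.0), samples, labels, rbf_kernel)
```

With p=1 the polynomial form above gives exactly the squared Euclidean distance, and the radial basis distance grows monotonically with the Euclidean distance, so both choices pick the same neighbor as the conventional algorithm, in line with the summary. Other kernels (for example p=2) can rank neighbors differently.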
