
[Paper Reading]---061 An Empirical Analysis of Design Choices in Neighborhood-Based Collaborative Filtering

Read 2242 times | 2016-5-31 15:32 | Category: Research notes

An Empirical Analysis of Design Choices in Neighborhood-Based Collaborative Filtering Algorithms


Abstract. Collaborative filtering systems predict a user’s interest in new items based on the recommendations of other people with similar interests. Instead of performing content indexing or content analysis, collaborative filtering systems rely entirely on interest ratings from members of a participating community. Since predictions are based on human ratings, collaborative filtering systems have the potential to provide filtering based on complex attributes, such as quality, taste, or aesthetics. Many implementations of collaborative filtering apply some variation of the neighborhood-based prediction algorithm. Many variations of similarity metrics, weighting approaches, combination measures, and rating normalization have appeared in each implementation. For these parameters and others, there is no consensus as to which choice of technique is most appropriate for what situations, nor how significant an effect on accuracy each parameter has. Consequently, every person implementing a collaborative filtering system must make hard design choices with little guidance. This article provides a set of recommendations to guide design of neighborhood-based prediction systems, based on the results of an empirical study. We apply an analysis framework that divides the neighborhood-based prediction approach into three components and then examines variants of the key parameters in each component. The three components identified are similarity computation, neighbor selection, and rating combination.


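This paper is an empirical analysis of recommender systems built on collaborative filtering. The problem it addresses is that there is still no clear consensus on which parameter choices suit which conditions, nor on how much each parameter actually affects accuracy. The authors investigate this experimentally, examining three key components of a neighborhood-based recommender: similarity computation, neighbor selection, and rating combination. The paper is best described as part survey, part empirical study. It has been cited more than 500 times, which suggests its conclusions are well worth consulting.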
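To make the three components concrete, here is a minimal sketch of a user-based neighborhood predictor. It is only an illustration under my own assumptions, not the configuration the paper recommends: Pearson correlation for similarity, top-k positively correlated users as neighbors, and a similarity-weighted average of deviations from each neighbor's mean rating for the combination step. The `RATINGS` data and all function names are hypothetical.

```python
from math import sqrt

# Toy ratings: user -> {item: rating}. Hypothetical data for illustration only.
RATINGS = {
    "alice": {"a": 5, "b": 3, "c": 4},
    "bob":   {"a": 4, "b": 2, "c": 3, "d": 4},
    "carol": {"a": 1, "b": 5, "c": 2, "d": 1},
    "dave":  {"a": 5, "b": 3, "c": 5, "d": 5},
}


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def pearson(ru, rv):
    """Component 1 (similarity): Pearson correlation over co-rated items."""
    common = sorted(set(ru) & set(rv))
    if len(common) < 2:
        return 0.0  # too little overlap to say anything
    mu = mean(ru[i] for i in common)
    mv = mean(rv[i] for i in common)
    num = sum((ru[i] - mu) * (rv[i] - mv) for i in common)
    den = sqrt(sum((ru[i] - mu) ** 2 for i in common)) * sqrt(
        sum((rv[i] - mv) ** 2 for i in common)
    )
    return num / den if den else 0.0


def select_neighbors(ratings, user, item, k=2):
    """Component 2 (neighbor selection): the k most similar users who
    rated `item`, keeping only positive correlations (an assumption)."""
    scored = [
        (pearson(ratings[user], r), other)
        for other, r in ratings.items()
        if other != user and item in r
    ]
    scored = [(s, other) for s, other in scored if s > 0]
    return sorted(scored, reverse=True)[:k]


def predict(ratings, user, item, k=2):
    """Component 3 (rating combination): the user's mean rating plus the
    similarity-weighted average of neighbors' deviations from their means."""
    base = mean(ratings[user].values())
    neighbors = select_neighbors(ratings, user, item, k)
    norm = sum(abs(s) for s, _ in neighbors)
    if norm == 0:
        return base  # no usable neighbors: fall back to the user's mean
    dev = sum(
        s * (ratings[other][item] - mean(ratings[other].values()))
        for s, other in neighbors
    )
    return base + dev / norm


if __name__ == "__main__":
    print(select_neighbors(RATINGS, "alice", "d"))
    print(round(predict(RATINGS, "alice", "d"), 2))
```

On this toy data the neighbors chosen for alice are bob and dave (carol correlates negatively and is dropped), and the predicted rating for item `d` comes out at roughly 4.63. Each function maps to one of the three components, so swapping in a different similarity measure, neighborhood size, or normalization only touches one of them, which is exactly the kind of design choice the paper compares.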

To repost this article, please contact the original author for permission and credit it to Cao Jianping's ScienceNet blog.
Link: https://blog.sciencenet.cn/blog-656867-981483.html
