Paper Title

Seamlessly Unifying Attributes and Items: Conversational Recommendation for Cold-Start Users

Paper Authors

Shijun Li, Wenqiang Lei, Qingyun Wu, Xiangnan He, Peng Jiang, Tat-Seng Chua

Paper Abstract

Static recommendation methods like collaborative filtering suffer from the inherent limitation of performing real-time personalization for cold-start users. Online recommendation, e.g., the multi-armed bandit approach, addresses this limitation by interactively exploring user preference online and pursuing the exploration-exploitation (EE) trade-off. However, existing bandit-based methods model recommendation actions homogeneously. Specifically, they only consider the items as the arms, being incapable of handling the item attributes, which naturally provide interpretable information about the user's current demands and can effectively filter out undesired items. In this work, we consider conversational recommendation for cold-start users, where a system can both ask a user about attributes and recommend items interactively. This important scenario was studied in a recent work. However, it employs a hand-crafted function to decide when to ask attributes or make recommendations. Such separate modeling of attributes and items makes the effectiveness of the system rely heavily on the choice of the hand-crafted function, thus introducing fragility into the system. To address this limitation, we seamlessly unify attributes and items in the same arm space and achieve their EE trade-offs automatically using the framework of Thompson Sampling. Our Conversational Thompson Sampling (ConTS) model holistically solves all questions in conversational recommendation by choosing the arm with the maximal reward to play. Extensive experiments on three benchmark datasets show that ConTS outperforms the state-of-the-art methods Conversational UCB (ConUCB) and the Estimation-Action-Reflection model on both metrics of success rate and average number of conversation turns.
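To make the core idea in the abstract concrete, the sketch below shows a minimal Thompson Sampling loop in which attribute arms and item arms live in one unified arm space and each turn plays the arm with the maximal sampled reward. This is an illustrative simplification under assumed Gaussian linear-reward dynamics, not the paper's actual ConTS algorithm: the arm embeddings, the `simulate_feedback` helper, and the posterior update rule here are all hypothetical stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)

d = 8  # embedding dimension (assumed)
# Attributes and items are placed in the SAME arm space, so asking an attribute
# and recommending an item are selected by one unified scoring rule.
attribute_arms = {f"attr_{i}": rng.normal(size=d) for i in range(5)}
item_arms = {f"item_{i}": rng.normal(size=d) for i in range(20)}
arms = {**attribute_arms, **item_arms}

# Gaussian posterior over the cold-start user's preference vector.
mu = np.zeros(d)   # posterior mean
cov = np.eye(d)    # posterior covariance

def simulate_feedback(arm_x: np.ndarray) -> float:
    """Hypothetical stand-in for real user feedback (click / attribute confirmation)."""
    true_pref = np.ones(d) / np.sqrt(d)
    return float(true_pref @ arm_x + rng.normal(scale=0.1))

for turn in range(15):
    # 1. Thompson Sampling: draw a preference vector from the current posterior.
    theta = rng.multivariate_normal(mu, cov)

    # 2. Play the arm with the maximal sampled reward, whether it is an
    #    attribute question or an item recommendation; the EE trade-off
    #    comes from the posterior sampling itself.
    name, x = max(arms.items(), key=lambda kv: theta @ kv[1])
    reward = simulate_feedback(x)

    # 3. Bayesian linear-Gaussian posterior update (noise variance assumed 1).
    prec_old = np.linalg.inv(cov)
    cov = np.linalg.inv(prec_old + np.outer(x, x))
    mu = cov @ (prec_old @ mu + x * reward)

    print(f"turn {turn:2d}: played {name:8s} reward {reward:+.2f}")
```

In a real conversational setting, confirmed attributes would additionally be used to filter the remaining item arms and an asked attribute would not be asked again; the sketch omits both steps for brevity.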
