Paper Title
FedCTR: Federated Native Ad CTR Prediction with Multi-Platform User Behavior Data
Paper Authors
Paper Abstract
Native ads are a popular type of online advertisement whose form resembles the native content displayed on a website. Native ad click-through rate (CTR) prediction is useful for improving both user experience and platform revenue. However, it is challenging because explicit user intent is missing, and users' behaviors on the platform that serves the native ads may not be sufficient to infer their interest in those ads. Fortunately, user behaviors exist on many online platforms, and they can provide complementary information for user interest mining. Leveraging multi-platform user behaviors is therefore useful for native ad CTR prediction. User behaviors are highly privacy-sensitive, though, and behavior data on different platforms cannot be directly aggregated because of user privacy concerns and data protection regulations such as GDPR. Existing CTR prediction methods usually require centralized storage of user behavior data for user modeling, so they cannot be directly applied to CTR prediction with multi-platform user behaviors. In this paper, we propose a federated native ad CTR prediction method named FedCTR, which learns user interest representations from users' behaviors on multiple platforms in a privacy-preserving way. On each platform, a local user model learns user embeddings from the local user behaviors on that platform. The local user embeddings from different platforms are uploaded to a server for aggregation, and the aggregated user embeddings are sent to the ad platform for CTR prediction. In addition, we apply local differential privacy (LDP) to the local user embeddings and differential privacy (DP) to the aggregated user embeddings for better privacy protection. Moreover, we propose a federated framework for model training with distributed models and user behaviors. Extensive experiments on a real-world dataset show that FedCTR can effectively leverage multi-platform user behaviors for native ad CTR prediction in a privacy-preserving manner.
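The embedding flow described in the abstract (local embeddings, LDP perturbation, server-side aggregation, DP perturbation, hand-off to the ad platform) can be illustrated with a minimal sketch. This is not the authors' implementation: the clip-then-Laplace-noise mechanism, the plain averaging used for aggregation, and every function and parameter name (`perturb`, `aggregate`, `federated_user_embedding`, `bound`, `ldp_scale`, `dp_scale`) are assumptions made for illustration only.

```python
import random


def clip(vec, bound):
    """Clip every coordinate into [-bound, bound] so the noise scale is meaningful."""
    return [max(-bound, min(bound, x)) for x in vec]


def laplace_noise(scale, rng):
    """Draw one Laplace(0, scale) sample.

    The difference of two i.i.d. exponential variables with mean `scale`
    follows a Laplace distribution with that scale.
    """
    if scale == 0:
        return 0.0
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def perturb(vec, bound, scale, rng):
    """Assumed privacy step: clip the embedding, then add Laplace noise."""
    return [x + laplace_noise(scale, rng) for x in clip(vec, bound)]


def aggregate(embeddings):
    """Assumed aggregator: coordinate-wise mean (a stand-in for the server model)."""
    n = len(embeddings)
    return [sum(column) / n for column in zip(*embeddings)]


def federated_user_embedding(local_embeddings, bound, ldp_scale, dp_scale, rng):
    """Produce the embedding that would be sent to the ad platform.

    1. Each platform perturbs its own embedding before upload (LDP step).
    2. The server aggregates the protected embeddings.
    3. The server perturbs the aggregate before sending it on (DP step).
    """
    protected = [perturb(e, bound, ldp_scale, rng) for e in local_embeddings]
    merged = aggregate(protected)
    return perturb(merged, bound, dp_scale, rng)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical 2-dimensional embeddings from two behavior platforms.
    platform_embeddings = [[0.5, -2.0], [1.5, 0.0]]
    print(federated_user_embedding(platform_embeddings, 1.0, 0.1, 0.1, rng))
```

In this sketch the raw behavior data never leaves a platform; only clipped and noised embeddings are uploaded. Larger noise scales give stronger protection at the cost of a less informative user embedding.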