Paper Title
Measuring Disparate Outcomes of Content Recommendation Algorithms with Distributional Inequality Metrics
Paper Authors
Paper Abstract
The harmful impacts of algorithmic decision systems have recently come into focus, with many examples of systems such as machine learning (ML) models amplifying existing societal biases. Most metrics attempting to quantify disparities resulting from ML algorithms focus on differences between groups, dividing users based on demographic identities and comparing model performance or overall outcomes between these groups. However, in industry settings, such information is often not available, and inferring these characteristics carries its own risks and biases. Moreover, typical metrics that focus on a single classifier's output ignore the complex network of systems that produce outcomes in real-world settings. In this paper, we evaluate a set of metrics originating from economics, distributional inequality metrics, and their ability to measure disparities in content exposure in a production recommendation system, the Twitter algorithmic timeline. We define desirable criteria for metrics to be used in an operational setting, specifically by ML practitioners. We characterize different types of engagement with content on Twitter using these metrics, and use these results to evaluate the metrics with respect to the desired criteria. We show that we can use these metrics to identify content suggestion algorithms that contribute more strongly to skewed outcomes between users. Overall, we conclude that these metrics can be useful tools for understanding disparate outcomes in online social networks.
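To make the idea of a distributional inequality metric concrete, the sketch below computes two standard measures from economics, the Gini coefficient and the share held by the top fraction of a population, over a list of per-user exposure counts. The abstract does not name the specific metrics the paper evaluates, so the choice of these two is an assumption, and the function names and the impression counts are hypothetical illustrations rather than anything taken from the paper or from Twitter data.

```python
from typing import Sequence


def gini(values: Sequence[float]) -> float:
    """Gini coefficient of non-negative values.

    0 means everyone receives the same amount; values near 1 mean
    almost everything is concentrated on a few users.
    """
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted form: G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n,
    # with x sorted ascending and ranks i = 1..n.
    weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1) / n


def top_share(values: Sequence[float], fraction: float) -> float:
    """Share of the total held by the top `fraction` of users."""
    xs = sorted(values, reverse=True)
    total = sum(xs)
    if not xs or total == 0:
        return 0.0
    k = max(1, round(fraction * len(xs)))  # always count at least one user
    return sum(xs[:k]) / total


if __name__ == "__main__":
    # Hypothetical impression counts per author under two ranking setups.
    evenly_spread = [100, 100, 100, 100]
    concentrated = [5, 5, 10, 380]
    for name, counts in [("even", evenly_spread), ("skewed", concentrated)]:
        print(name, round(gini(counts), 3), round(top_share(counts, 0.25), 3))
```

Neither function needs demographic labels: both operate only on the distribution of outcomes across users, which is the property the abstract highlights for industry settings where such labels are unavailable.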