Paper title
Learning from missing data with the Latent Block Model
Paper authors
Paper abstract
Missing data can be informative. When the data model does not allow information to be extracted from the missing entries, ignoring them can lead to misleading conclusions. We propose a co-clustering model, based on the Latent Block Model, that aims to take advantage of these nonignorable nonresponses, also known as Missing Not At Random (MNAR) data. A variational expectation-maximization algorithm is derived to perform inference, and a model selection criterion is presented. We assess the proposed approach in a simulation study, then apply our model to the voting records of the lower house of the French Parliament, where our analysis brings out relevant groups of MPs and texts, together with a sensible interpretation of the behavior of non-voters.