Paper title
Bias and Excess Variance in Election Polling: A Not-So-Hidden Markov Model
Paper authors
Paper abstract
With historic misses in the 2016 and 2020 US Presidential elections, interest in measuring polling errors has increased. The most common way to measure directional error and non-sampling excess variability in an election postmortem is to assess the difference between poll results and the election result for polls conducted within a few days of election day. Analyzing such polling error data is notoriously difficult because typical models are extremely sensitive to the time between the poll and the election. We leverage hidden Markov models traditionally used for election forecasting to flexibly capture time-varying preferences and treat the election result as a peek at the typically hidden Markovian process. Our results are much less sensitive to the choice of time window, avoid conflating shifting preferences with polling error, and are more interpretable despite the model's high flexibility. We demonstrate these results with polling data from the 2004 through 2020 US Presidential elections and the 1992 through 2020 US Senate elections, concluding that previously reported estimates of bias in Presidential elections were too extreme by 10%, estimated bias in Senatorial elections was too extreme by 25%, and excess variability estimates were also too large.
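The window sensitivity described in the abstract can be illustrated with a small sketch. This is not the authors' model: it implements only the conventional poll-minus-result comparison, and the function name `naive_bias`, the linear preference drift, and all numbers are hypothetical values chosen for illustration. The toy polls carry a constant true bias of +1 point, while the underlying preference drifts up by 0.1 points per day toward election day.

```python
from statistics import mean

def naive_bias(polls, election_result, window_days):
    """Mean signed error (poll share minus election result) for polls
    taken at most `window_days` before the election.

    `polls` is a list of (days_before_election, poll_share) pairs.
    """
    errors = [share - election_result
              for days, share in polls
              if days <= window_days]
    return mean(errors)

# Hypothetical, noise-free toy data (assumptions, not real polls):
# true preference is 0.50 on election day and was lower earlier,
# and every poll overstates the true preference by 0.01.
ELECTION_RESULT = 0.50
TRUE_BIAS = 0.01
DRIFT_PER_DAY = 0.001

polls = [(d, ELECTION_RESULT - DRIFT_PER_DAY * d + TRUE_BIAS)
         for d in range(0, 21)]

bias_3_days = naive_bias(polls, ELECTION_RESULT, window_days=3)
bias_20_days = naive_bias(polls, ELECTION_RESULT, window_days=20)

print(f"3-day window:  {bias_3_days:+.4f}")
print(f"20-day window: {bias_20_days:+.4f}")
```

In this toy setup the 20-day window reports almost no bias because the preference drift cancels the true polling bias, while the 3-day window recovers most of it. A model that tracks the time-varying preference separately, as the abstract proposes, is meant to avoid that conflation.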