Paper title
A framework for probabilistic weather forecast post-processing across models and lead times using machine learning
Paper authors
Paper abstract
Forecasting the weather is an increasingly data-intensive exercise. Numerical Weather Prediction (NWP) models are becoming more complex, with higher resolutions, and there are increasing numbers of different models in operation. While the forecasting skill of NWP models continues to improve, the number and complexity of these models pose a new challenge for the operational meteorologist: how should the information from all available models, each with its own unique biases and limitations, be combined in order to provide stakeholders with well-calibrated probabilistic forecasts to use in decision making? In this paper, we use a road surface temperature example to demonstrate a three-stage framework that uses machine learning to bridge the gap between sets of separate forecasts from NWP models and the 'ideal' forecast for decision support: probabilities of future weather outcomes. First, we use Quantile Regression Forests to learn the error profile of each numerical model, and use these to apply empirically derived probability distributions to forecasts. Second, we combine these probabilistic forecasts using quantile averaging. Third, we interpolate between the aggregate quantiles in order to generate a full predictive distribution, which we demonstrate has properties suitable for decision support. Our results suggest that this approach provides an effective and operationally viable framework for the cohesive post-processing of weather forecasts across multiple models and lead times to produce a well-calibrated probabilistic output.
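The second and third stages described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes each model's first-stage output is already available as forecast values at a shared set of quantile levels, and it assumes linear interpolation between quantiles, which the abstract does not specify. The quantile levels, the temperature values, and the function names `average_quantiles` and `cdf_from_quantiles` are all hypothetical.

```python
from bisect import bisect_right

# Quantile levels shared by every model's forecast (assumed for illustration).
LEVELS = [0.1, 0.25, 0.5, 0.75, 0.9]


def average_quantiles(forecasts):
    """Stage 2: average the forecast values level by level across models."""
    n_models = len(forecasts)
    return [sum(values) / n_models for values in zip(*forecasts)]


def cdf_from_quantiles(levels, values, x):
    """Stage 3: interpolate linearly between quantiles to estimate P(T <= x).

    Assumes `values` is sorted in increasing order. Outside the outermost
    quantiles the result is clamped to the outermost levels; this is a
    simplification, and a real system would need an explicit tail model.
    """
    if x <= values[0]:
        return levels[0]
    if x >= values[-1]:
        return levels[-1]
    i = bisect_right(values, x)  # values[i - 1] <= x < values[i]
    x0, x1 = values[i - 1], values[i]
    p0, p1 = levels[i - 1], levels[i]
    return p0 + (p1 - p0) * (x - x0) / (x1 - x0)


# Hypothetical road surface temperature quantiles (deg C) from two models.
model_a = [-1.5, -0.5, 0.5, 1.5, 2.5]
model_b = [-0.5, 0.5, 1.5, 2.5, 3.5]

combined = average_quantiles([model_a, model_b])
# Estimated probability that the road surface is at or below freezing.
prob_freezing = cdf_from_quantiles(LEVELS, combined, 0.0)
```

Averaging at fixed quantile levels, rather than averaging probabilities at fixed temperatures, keeps the combined forecast as sharp as its inputs when the models differ mainly by an offset. The interpolated distribution can then be queried at any threshold a decision maker cares about, such as 0 deg C for road icing.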