Paper Title


Wasserstein Distributionally Robust Estimation in High Dimensions: Performance Analysis and Optimal Hyperparameter Tuning

Authors

Liviu Aolaritei, Soroosh Shafiee, Florian Dörfler

Abstract


Distributionally robust optimization (DRO) has become a powerful framework for estimation under uncertainty, offering strong out-of-sample performance and principled regularization. In this paper, we propose a DRO-based method for linear regression and address a central question: how to optimally choose the robustness radius, which controls the trade-off between robustness and accuracy. Focusing on high-dimensional settings where the dimension and the number of samples are both large and comparable in size, we employ tools from high-dimensional asymptotic statistics to precisely characterize the estimation error of the resulting estimator. Remarkably, this error can be recovered by solving a simple convex-concave optimization problem involving only four scalar variables. This characterization enables efficient selection of the radius that minimizes the estimation error. In doing so, it achieves the same effect as cross-validation, but at a fraction of the computational cost. Numerical experiments confirm that our theoretical predictions closely match empirical performance and that the optimal radius selected through our method aligns with that chosen by cross-validation, highlighting both the accuracy and the practical benefits of our approach.
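The abstract's core idea can be illustrated with a small sketch. It is known (for suitable transport costs) that Wasserstein DRO linear regression reduces to a norm-regularized problem of the form min_β ‖y − Xβ‖₂/√n + ε‖β‖₂, where the robustness radius ε plays the role of the regularization strength. The sketch below sweeps ε on synthetic data and picks the radius minimizing the estimation error ‖β̂ − β*‖₂; it is a hypothetical illustration of the trade-off, not the paper's four-scalar-variable characterization, and all names (`dro_objective`, `fit`, the grid of radii) are assumptions.

```python
import numpy as np
from scipy.optimize import minimize

# Synthetic high-dimensional-flavored setup: n and d comparable in size.
rng = np.random.default_rng(0)
n, d = 200, 50
beta_true = rng.normal(size=d) / np.sqrt(d)
X = rng.normal(size=(n, d))
y = X @ beta_true + 0.5 * rng.normal(size=n)

def dro_objective(beta, eps):
    # Regularized form associated with Wasserstein DRO regression:
    # root-mean-square residual plus radius-weighted norm penalty.
    return np.linalg.norm(y - X @ beta) / np.sqrt(n) + eps * np.linalg.norm(beta)

def fit(eps):
    # Start away from the origin, where the norm penalty is non-smooth.
    beta0 = 0.1 * rng.normal(size=d)
    res = minimize(dro_objective, beta0, args=(eps,), method="L-BFGS-B")
    return res.x

# Sweep candidate radii and measure estimation error against the truth;
# in practice the paper's theory replaces this oracle comparison.
radii = [0.0, 0.01, 0.05, 0.1, 0.3]
errors = {eps: np.linalg.norm(fit(eps) - beta_true) for eps in radii}
best_eps = min(errors, key=errors.get)
print(f"best radius: {best_eps}, error: {errors[best_eps]:.4f}")
```

In the paper's setting, the estimation error for each radius is predicted by solving a four-variable convex-concave scalar problem rather than by refitting, which is what makes the tuning far cheaper than cross-validation.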
