论文标题
最大主要熵抽样问题的最佳主要主要子正确选择:可扩展算法和性能保证
Best Principal Submatrix Selection for the Maximum Entropy Sampling Problem: Scalable Algorithms and Performance Guarantees
论文作者
论文摘要
本文研究了一个经典的最大熵抽样问题(MESP),该问题旨在从协方差矩阵中选择预先指定尺寸的最有用的主体子atrix。 MESP已广泛应用于许多领域,包括医疗保健,电力系统,制造和数据科学。通过调查其拉格朗日双重和原始特征,我们得出了一个用于MESP的新型凸整数程序,并表明其连续的松弛产生了近乎最佳的解决方案。结果激发了我们研究有效的采样算法,并开发了其对MESP的近似结合,从而改善了文献中最著名的结合。然后,我们提供具有相同近似结合的采样算法的有效确定性实现。通过为奇异矩阵开发新的数学工具,并分析了拟议的凸整数程序的拉格朗日双重二元,我们研究了广泛使用的本地搜索算法,并证明了其第一个已知的MESP绑定的近似值。证明技术通过有效实施本地搜索算法进一步激发了我们的灵感。我们的数值实验表明,这些近似算法可以有效地将中型和大规模的实例求解到近距离的情况下。我们提出的算法被编码并作为开源软件发布。最后,我们将分析扩展到A-最佳MESP(A-MESP),其中目的是最大程度地减少所选主符号的反向的轨迹。
This paper studies a classic maximum entropy sampling problem (MESP), which aims to select the most informative principal submatrix of a prespecified size from a covariance matrix. MESP has been widely applied to many areas, including healthcare, power system, manufacturing and data science. By investigating its Lagrangian dual and primal characterization, we derive a novel convex integer program for MESP and show that its continuous relaxation yields a near-optimal solution. The results motivate us to study an efficient sampling algorithm and develop its approximation bound for MESP, which improves the best-known bound in literature. We then provide an efficient deterministic implementation of the sampling algorithm with the same approximation bound. By developing new mathematical tools for the singular matrices and analyzing the Lagrangian dual of the proposed convex integer program, we investigate the widely-used local search algorithm and prove its first-known approximation bound for MESP. The proof techniques further inspire us with an efficient implementation of the local search algorithm. Our numerical experiments demonstrate that these approximation algorithms can efficiently solve medium-sized and large-scale instances to near-optimality. Our proposed algorithms are coded and released as open-source software. Finally, we extend the analyses to the A-Optimal MESP (A-MESP), where the objective is to minimize the trace of the inverse of the selected principal submatrix.