Paper Title

High-Dimensional Multi-Task Averaging and Application to Kernel Mean Embedding

Authors

Hannah Marienwald, Jean-Baptiste Fermanian, Gilles Blanchard

Abstract

We propose an improved estimator for the multi-task averaging problem, whose goal is the joint estimation of the means of multiple distributions using separate, independent data sets. The naive approach is to take the empirical mean of each data set individually, whereas the proposed method exploits similarities between tasks, without any related information being known in advance. First, for each data set, similar or neighboring means are determined from the data by multiple testing. Then each naive estimator is shrunk towards the local average of its neighbors. We prove theoretically that this approach provides a reduction in mean squared error. This improvement can be significant when the dimension of the input space is large, demonstrating a "blessing of dimensionality" phenomenon. An application of this approach is the estimation of multiple kernel mean embeddings, which plays an important role in many modern applications. The theoretical results are verified on artificial and real-world data.
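
The two-step procedure described in the abstract (find neighboring means by testing, then shrink each naive estimator toward the local average of its neighbors) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' estimator: the simple distance-threshold test, the parameters `tau` and `gamma`, and the function name `shrink_to_neighbors` are all hypothetical choices made for the example.

```python
import numpy as np

def shrink_to_neighbors(samples, tau=1.0, gamma=0.5):
    """Illustrative multi-task mean estimation (assumed, simplified scheme).

    samples: list of arrays, one per task, each of shape (n_k, d).
    tau:     hypothetical tolerance of the neighbor test.
    gamma:   hypothetical weight kept on the task's own naive mean.
    Returns (shrunk_means, naive_means, neighbor_mask).
    """
    # Step 0: naive estimators, i.e. the per-task empirical means.
    naive = np.stack([x.mean(axis=0) for x in samples])            # (B, d)

    # Estimated noise level of each naive mean: trace(cov) / n_k.
    noise = np.array([x.var(axis=0, ddof=1).sum() / len(x) for x in samples])

    # Step 1: neighbor test (assumed form). For independent tasks,
    # E||m_i - m_j||^2 = ||mu_i - mu_j||^2 + noise_i + noise_j, so a pair is
    # declared "neighboring" when the observed squared distance does not
    # exceed the pure-noise level by more than a factor (1 + tau).
    diff = naive[:, None, :] - naive[None, :, :]
    sq_dist = (diff ** 2).sum(axis=-1)                              # (B, B)
    pair_noise = noise[:, None] + noise[None, :]
    neighbors = sq_dist <= (1.0 + tau) * pair_noise
    np.fill_diagonal(neighbors, True)   # each task is its own neighbor

    # Step 2: shrink each naive mean toward the average of its neighbors.
    weights = neighbors / neighbors.sum(axis=1, keepdims=True)
    local_avg = weights @ naive
    shrunk = gamma * naive + (1.0 - gamma) * local_avg
    return shrunk, naive, neighbors
```

Calling `shrink_to_neighbors` on a list of per-task sample arrays returns the shrunk estimates alongside the naive ones, so their errors can be compared on synthetic data where the true means are known. For kernel mean embeddings, the same idea would be applied to feature-space means, with squared distances computed through the kernel; that variant is not shown here.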
