Paper Title

HOME: High-Order Mixed-Moment-based Embedding for Representation Learning

Authors

Chuang Niu, Ge Wang

Abstract

Minimum redundancy among different elements of an embedding in a latent space is a fundamental requirement or major preference in representation learning to capture intrinsic informational structures. Current self-supervised learning methods minimize the off-diagonal entries of a pairwise covariance matrix to reduce feature redundancy and produce promising results. However, such representation features of multiple variables may still contain redundancy among more than two feature variables, which cannot be minimized via pairwise regularization. Here we propose the High-Order Mixed-Moment-based Embedding (HOME) strategy to reduce the redundancy among any set of feature variables, which is, to the best of our knowledge, the first attempt to utilize high-order statistics/information in this context. Multivariate mutual information is minimized if and only if the variables are mutually independent, which suggests the necessary condition of factorized mixed moments among multiple variables. Based on these statistical and information-theoretic principles, our general HOME framework is presented for self-supervised representation learning. Our initial experiments show that a simple version in the form of a three-order HOME scheme already significantly outperforms the current two-order baseline method (i.e., Barlow Twins) in terms of linear evaluation on representation features.
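
To make the redundancy-reduction principle concrete, below is a minimal PyTorch sketch of a three-order mixed-moment penalty. It is not the authors' exact loss, which the abstract does not specify: the function name, the way the two augmented views enter the moment tensor, and the weighting factor are illustrative assumptions. The sketch only instantiates the stated principle that, for standardized and mutually independent feature variables, every third-order mixed moment E[z_i z_j z_k] whose indices are not all equal factorizes to zero, while an invariance term on the diagonal of the pairwise cross-correlation matrix (as in Barlow Twins) keeps the two views aligned.

```python
# A minimal sketch of a three-order mixed-moment redundancy penalty.
# Illustrative only; the exact HOME loss is not reproduced here.
import torch


def three_order_redundancy_loss(z_a: torch.Tensor,
                                z_b: torch.Tensor,
                                lambda_off: float = 5e-3) -> torch.Tensor:
    """z_a, z_b: (N, D) embeddings of two augmented views of the same batch."""
    n, d = z_a.shape

    # Standardize each feature dimension over the batch (zero mean, unit std),
    # as done for the cross-correlation matrix in Barlow Twins.
    z_a = (z_a - z_a.mean(dim=0)) / (z_a.std(dim=0) + 1e-6)
    z_b = (z_b - z_b.mean(dim=0)) / (z_b.std(dim=0) + 1e-6)

    # Invariance term: the diagonal of the second-order cross-correlation
    # matrix between the two views should be 1, as in Barlow Twins.
    c = (z_a.T @ z_b) / n                      # (D, D)
    invariance = ((torch.diagonal(c) - 1.0) ** 2).sum()

    # Third-order mixed-moment tensor across the two views (one possible
    # choice of how to mix them; an assumption of this sketch):
    # T[i, j, k] = E_b[ z_a[b, i] * z_a[b, j] * z_b[b, k] ].
    t = torch.einsum('bi,bj,bk->ijk', z_a, z_a, z_b) / n   # (D, D, D)

    # Redundancy term: under mutual independence of zero-mean features, every
    # mixed moment whose indices are not all equal factorizes to zero, so we
    # push those entries toward zero.
    idx = torch.arange(d)
    mask = torch.ones_like(t, dtype=torch.bool)
    mask[idx, idx, idx] = False
    redundancy = (t[mask] ** 2).sum()

    return invariance + lambda_off * redundancy
```

Note that the full third-order tensor has D^3 entries, so for large embedding dimensions a practical implementation would likely subsample index triplets or exploit the tensor's symmetry rather than materializing it in full.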
