Paper Title

Competing Mutual Information Constraints with Stochastic Competition-based Activations for Learning Diversified Representations

Authors

Konstantinos P. Panousis, Anastasios Antoniadis, Sotirios Chatzis

Abstract

This work aims to address the long-established problem of learning diversified representations. To this end, we combine information-theoretic arguments with stochastic competition-based activations, namely Stochastic Local Winner-Takes-All (LWTA) units. In this context, we abandon the conventional deep architectures commonly used in representation learning, which rely on nonlinear activations; instead, we replace them with sets of locally and stochastically competing linear units. In this setting, each network layer yields sparse outputs, determined by the outcome of the competition between units that are organized into blocks of competitors. We adopt stochastic arguments for the competition mechanism, which perform posterior sampling to determine the winner of each block. We further endow the considered networks with the ability to infer the sub-part of the network that is essential for modeling the data at hand; we impose appropriate stick-breaking priors to this end. To further enrich the information of the emerging representations, we resort to information-theoretic principles, namely the Information Competing Process (ICP). All the components are then tied together under the stochastic Variational Bayes framework for inference. We perform a thorough experimental investigation of our approach using benchmark datasets on image classification. As we experimentally show, the resulting networks yield significant discriminative representation learning abilities. In addition, the introduced paradigm provides a principled mechanism for investigating the emerging intermediate network representations.
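To make the competition mechanism in the abstract more concrete, the following is a minimal sketch of a forward pass through one stochastic LWTA layer. It is not the authors' implementation. The function names (`softmax`, `stochastic_lwta_layer`), the toy weights, and the choice of a softmax over the linear responses as the winner distribution are all assumptions made for illustration; the stick-breaking priors, the ICP objective, and variational training are omitted.

```python
import math
import random


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def stochastic_lwta_layer(x, weights, rng):
    """Forward pass of one illustrative stochastic LWTA layer.

    x       : list of input features.
    weights : weights[b][u] is the weight vector of unit u in block b.
    rng     : random.Random instance used to sample the winners.
    Returns a flat list with one entry per unit; each block has at most
    one non-zero entry (the sampled winner's linear response).
    """
    outputs = []
    for block in weights:
        # Linear response of every competitor in the block (no nonlinearity).
        responses = [sum(w_i * x_i for w_i, x_i in zip(unit, x)) for unit in block]
        # Assumption: winner probabilities are a softmax of the linear responses.
        probs = softmax(responses)
        winner = rng.choices(range(len(block)), weights=probs, k=1)[0]
        # Only the sampled winner passes its response on; the rest output zero.
        outputs.extend(r if u == winner else 0.0 for u, r in enumerate(responses))
    return outputs


# Toy example: 3 input features, 2 blocks of 2 competing linear units each.
rng = random.Random(0)
x = [0.5, -1.0, 2.0]
weights = [
    [[0.1, 0.2, 0.3], [0.4, -0.5, 0.6]],
    [[-0.3, 0.1, 0.2], [0.2, 0.2, -0.1]],
]
y = stochastic_lwta_layer(x, weights, rng)
```

Because the winner is sampled rather than chosen by a hard maximum, repeated passes over the same input can activate different units within a block, while the output of each block remains sparse.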
