Paper Title

DHOG: Deep Hierarchical Object Grouping

Authors

Darlow, Luke Nicholas, Storkey, Amos

Abstract


Recently, a number of competitive methods have tackled unsupervised representation learning by maximising the mutual information between the representations produced from augmentations. The resulting representations are then invariant to stochastic augmentation strategies, and can be used for downstream tasks such as clustering or classification. Yet data augmentations preserve many properties of an image and so there is potential for a suboptimal choice of representation that relies on matching easy-to-find features in the data. We demonstrate that greedy or local methods of maximising mutual information (such as stochastic gradient optimisation) discover local optima of the mutual information criterion; the resulting representations are also less-ideally suited to complex downstream tasks. Earlier work has not specifically identified or addressed this issue. We introduce deep hierarchical object grouping (DHOG) that computes a number of distinct discrete representations of images in a hierarchical order, eventually generating representations that better optimise the mutual information objective. We also find that these representations align better with the downstream task of grouping into underlying object classes. We tested DHOG on unsupervised clustering, which is a natural downstream test as the target representation is a discrete labelling of the data. We achieved new state-of-the-art results on the three main benchmarks without any prefiltering or Sobel-edge detection that proved necessary for many previous methods to work. We obtain accuracy improvements of: 4.3% on CIFAR-10, 1.5% on CIFAR-100-20, and 7.2% on SVHN.
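The objective described above, maximising the mutual information between discrete representations of two augmentations of the same image, can be sketched as follows. This is a minimal NumPy illustration of the IIC-style criterion that DHOG builds on, not the authors' implementation; the function and variable names are ours.

```python
import numpy as np

def mutual_information_objective(p_a, p_b, eps=1e-10):
    """Mutual information between two soft cluster assignments,
    estimated from a batch. To be maximised during training.

    p_a, p_b: (batch, k) softmax outputs for two augmentations
    of the same batch of images.
    """
    # Empirical joint distribution over cluster pairs, symmetrised.
    joint = p_a.T @ p_b / p_a.shape[0]           # (k, k)
    joint = (joint + joint.T) / 2.0
    # Marginals for each view.
    pa = joint.sum(axis=1, keepdims=True)        # (k, 1)
    pb = joint.sum(axis=0, keepdims=True)        # (1, k)
    # I(A; B) = sum_ij P(i,j) * log( P(i,j) / (P(i) * P(j)) )
    return float(np.sum(joint * (np.log(joint + eps)
                                 - np.log(pa + eps)
                                 - np.log(pb + eps))))
```

When the two assignments agree perfectly and use all clusters equally, the objective reaches its maximum of log(k); when they are independent, it is zero. The paper's argument is that gradient-based training often stalls at local optima of this criterion, which DHOG addresses by learning a hierarchy of distinct labellings.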
