Diversified Dynamic Routing for Vision Tasks

Paper title

Diversified Dynamic Routing for Vision Tasks

Authors

Botos Csaba, Adel Bibi, Yanwei Li, Philip Torr, Ser-Nam Lim

Abstract

Deep learning models for vision tasks are trained on large datasets under the assumption that a universal representation exists that can be used to make predictions for all samples. While high-complexity models have been shown to be capable of learning such representations, a mixture of experts trained on specific subsets of the data can infer the labels more efficiently. However, using a mixture of experts poses two new problems: (i) assigning the correct expert at inference time when a new, unseen sample is presented, and (ii) finding the optimal partitioning of the training data, such that the experts rely the least on common features. Dynamic Routing (DR) proposes a novel architecture in which each layer is composed of a set of experts; however, we demonstrate that, without addressing these two challenges, the model reverts to using the same subset of experts. In our method, Diversified Dynamic Routing (DivDR), the model is explicitly trained to find a relevant partitioning of the data and to assign the correct experts in an unsupervised manner. We conduct several experiments on semantic segmentation on Cityscapes and on object detection and instance segmentation on MS-COCO, showing improved performance over several baselines.
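
The expert-collapse problem mentioned in the abstract can be illustrated with a toy example. The sketch below is not the DR or DivDR architecture: it assumes a generic linear gate with a softmax over a handful of experts, and all names (`route`, `expert_usage`, `usage_entropy`, the gate parameters) are hypothetical. It only shows how one might measure whether a gate sends every sample to the same expert, using the entropy of the expert-usage histogram as a simple diversity indicator.

```python
import math
import random
from collections import Counter


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route(sample, gate_weights, gate_bias):
    """Pick one expert for a sample with a linear gate (an assumed, generic gate).

    Returns the index of the highest-probability expert and the gate probabilities.
    """
    scores = [
        b + sum(w * x for w, x in zip(row, sample))
        for row, b in zip(gate_weights, gate_bias)
    ]
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best, probs


def expert_usage(samples, gate_weights, gate_bias):
    """Count how many samples are routed to each expert."""
    return Counter(route(s, gate_weights, gate_bias)[0] for s in samples)


def usage_entropy(usage, num_experts):
    """Entropy (in nats) of the expert-usage distribution; 0 means full collapse."""
    total = sum(usage.values())
    entropy = 0.0
    for k in range(num_experts):
        p = usage.get(k, 0) / total
        if p > 0:
            entropy -= p * math.log(p)
    return entropy


random.seed(0)
samples = [[random.gauss(0.0, 1.0) for _ in range(4)] for _ in range(200)]

# Collapsed gate: ignores the input, so expert 0 always wins.
collapsed_w = [[0.0] * 4 for _ in range(3)]
collapsed_b = [1.0, 0.0, 0.0]

# Input-dependent gate: each expert responds to a different feature.
diverse_w = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
diverse_b = [0.0, 0.0, 0.0]

collapsed_usage = expert_usage(samples, collapsed_w, collapsed_b)
diverse_usage = expert_usage(samples, diverse_w, diverse_b)

print("collapsed usage:", dict(collapsed_usage),
      "entropy:", usage_entropy(collapsed_usage, 3))
print("diverse usage:", dict(diverse_usage),
      "entropy:", usage_entropy(diverse_usage, 3))
```

In this toy setting the input-independent gate yields zero usage entropy, while the input-dependent gate spreads samples across experts. How DivDR actually encourages such diversity is described in the paper itself, not here.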
