Paper Title

An analysis on the use of autoencoders for representation learning: fundamentals, learning task case studies, explainability and challenges

Paper Authors

David Charte, Francisco Charte, María J. del Jesus, Francisco Herrera

Paper Abstract

In many machine learning tasks, learning a good representation of the data can be the key to building a well-performing solution. This is because most learning algorithms operate with the features in order to find models for the data. For instance, classification performance can improve if the data is mapped to a space where classes are easily separated, and regression can be facilitated by finding a manifold of data in the feature space. As a general rule, features are transformed by means of statistical methods such as principal component analysis, or manifold learning techniques such as Isomap or locally linear embedding. From a plethora of representation learning methods, one of the most versatile tools is the autoencoder. In this paper we aim to demonstrate how to influence its learned representations to achieve the desired learning behavior. To this end, we present a series of learning tasks: data embedding for visualization, image denoising, semantic hashing, detection of abnormal behaviors and instance generation. We model them from the representation learning perspective, following the state-of-the-art methodologies in each field. A solution is proposed for each task employing autoencoders as the only learning method. The theoretical developments are put into practice using a selection of datasets for the different problems and implementing each solution, followed by a discussion of the results in each case study and a brief explanation of six other learning applications. We also explore the current challenges and approaches to explainability in the context of autoencoders. All of this helps conclude that, thanks to alterations in their structure as well as their objective function, autoencoders may be the core of a possible solution to many problems which can be modeled as a transformation of the feature space.
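To make the core idea concrete, below is a minimal sketch (not taken from the paper) of a fully connected autoencoder in PyTorch: an encoder compresses each input into a low-dimensional code and a decoder reconstructs it, with training driven purely by reconstruction error. The layer sizes, the two-dimensional code, and the random placeholder data are illustrative assumptions, not the authors' experimental setup.

```python
# Minimal illustrative autoencoder sketch (assumed architecture, not the paper's code).
import torch
from torch import nn

class Autoencoder(nn.Module):
    def __init__(self, n_features: int, code_size: int = 2):
        super().__init__()
        # Encoder: compress the input into a small code (the learned representation).
        self.encoder = nn.Sequential(
            nn.Linear(n_features, 32), nn.ReLU(),
            nn.Linear(32, code_size),
        )
        # Decoder: reconstruct the original features from the code.
        self.decoder = nn.Sequential(
            nn.Linear(code_size, 32), nn.ReLU(),
            nn.Linear(32, n_features),
        )

    def forward(self, x):
        code = self.encoder(x)
        return self.decoder(code), code

# Training loop on random placeholder data, purely to show the objective:
# minimize the mean squared reconstruction error.
model = Autoencoder(n_features=20)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.randn(256, 20)  # placeholder dataset for illustration only

for epoch in range(100):
    reconstruction, code = model(x)
    loss = nn.functional.mse_loss(reconstruction, x)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# model.encoder(x) now yields the learned 2-D representation, e.g. for
# visualization, one of the case studies discussed in the paper.
```

The task-specific variants the abstract refers to (denoising, semantic hashing, anomaly detection, instance generation) are obtained, broadly speaking, by changing this objective function or the structure, for example corrupting the inputs while reconstructing the clean ones, or adding regularization terms on the code.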
