Title
Multi-Resolution Online Deterministic Annealing: A Hierarchical and Progressive Learning Architecture
Authors
Abstract
Hierarchical learning algorithms that gradually approximate a solution to a data-driven optimization problem are essential to decision-making systems, especially under limitations on time and computational resources. In this study, we introduce a general-purpose hierarchical learning architecture that is based on the progressive partitioning of a possibly multi-resolution data space. The optimal partition is gradually approximated by solving a sequence of optimization sub-problems that yield a sequence of partitions with an increasing number of subsets. We show that the solution of each optimization problem can be estimated online using gradient-free stochastic approximation updates. As a consequence, a function approximation problem can be defined within each subset of the partition and solved using the theory of two-timescale stochastic approximation algorithms. This simulates an annealing process and defines a robust and interpretable heuristic method to gradually increase the complexity of the learning architecture in a task-agnostic manner, giving emphasis to regions of the data space that are considered more important according to a predefined criterion. Finally, by imposing a tree structure on the progression of the partitions, we provide a means to incorporate potential multi-resolution structure of the data space into this approach, significantly reducing its complexity, while introducing hierarchical variable-rate feature extraction properties similar to certain classes of deep learning architectures. Asymptotic convergence analysis and experimental results are provided for supervised and unsupervised learning problems.
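To make the annealing idea concrete, the following is a minimal illustrative sketch, not the paper's exact recursion: codevectors (subset representatives) are updated online with decreasing step sizes (a gradient-free stochastic-approximation scheme), the temperature `T` controls the softness of the partition, and each drop in temperature perturbs ("splits") the codevectors so the number of subsets grows progressively. The function names, the splitting heuristic, and the step-size schedule are all assumptions made for illustration.

```python
import numpy as np

def soft_assign(x, centers, T):
    """Gibbs association probabilities p(c_i | x) ~ exp(-||x - c_i||^2 / T)."""
    d = np.sum((centers - x) ** 2, axis=1)
    w = np.exp(-(d - d.min()) / T)          # shift by d.min() for numerical stability
    return w / w.sum()

def online_annealing(data_stream, T_schedule, lr0=0.5, seed=0):
    """Illustrative online deterministic-annealing sketch (hypothetical helper,
    not the authors' algorithm): stochastic-approximation updates of the
    codevectors at each temperature, splitting codevectors as T decreases."""
    rng = np.random.default_rng(seed)
    centers = np.mean(data_stream, axis=0, keepdims=True)   # start with one codevector
    for T in T_schedule:                                    # decreasing temperatures
        # bifurcation heuristic: duplicate every codevector with a small perturbation,
        # doubling the number of subsets in the partition
        centers = np.vstack([centers, centers + 1e-2 * rng.standard_normal(centers.shape)])
        for t, x in enumerate(data_stream):
            lr = lr0 / (1 + t)                              # diminishing SA step size
            p = soft_assign(x, centers, T)
            centers += lr * p[:, None] * (x - centers)      # online update toward x
    return centers
```

At high `T` the assignment probabilities are nearly uniform, so split codevectors collapse back together; as `T` drops, the splits survive only in regions where the data warrants finer resolution, which is the sense in which complexity grows progressively and in a task-agnostic way.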