Paper Title
Tensor Train Factorization and Completion under Noisy Data with Prior Analysis and Rank Estimation
Paper Authors
Abstract
Tensor train (TT) decomposition, a powerful tool for analyzing multidimensional data, exhibits superior performance in many machine learning tasks. However, existing TT decomposition methods either suffer from noise overfitting or require extensive fine-tuning of the balance between model complexity and representation accuracy. In this paper, a fully Bayesian treatment of TT decomposition is employed to avoid noise overfitting, by endowing the model with the ability of automatic rank determination. In particular, theoretical evidence is established for adopting a Gaussian-product-Gamma prior to induce sparsity on the slices of the TT cores, so that the model complexity is automatically determined even under incomplete and noisy observations. Furthermore, based on the proposed probabilistic model, an efficient learning algorithm is derived under the variational inference framework. Simulation results on synthetic data show that the proposed model and algorithm succeed in recovering the ground-truth TT structure from incomplete noisy data. Further experiments on real-world data demonstrate that the proposed algorithm performs better in image completion and image classification than other existing TT decomposition algorithms.
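For context on the representation the abstract refers to: in the TT format, a d-way tensor is stored as a chain of 3-way cores, and each entry is a product of matrix slices taken from those cores. The rank pattern of the cores is exactly the model complexity the proposed Bayesian prior is meant to determine automatically. Below is a minimal, hedged sketch (not the paper's algorithm) of contracting TT cores back into a full tensor with NumPy; the shapes, ranks, and the helper name `tt_reconstruct` are illustrative assumptions.

```python
import numpy as np

# Assumed TT convention for illustration: a tensor of shape (n1, ..., nd)
# is stored as cores G_k of shape (r_{k-1}, n_k, r_k), with r_0 = r_d = 1.
def tt_reconstruct(cores):
    """Contract a list of TT cores back into the full tensor."""
    full = cores[0]  # shape (1, n1, r1)
    for core in cores[1:]:
        # Merge the trailing rank index of `full` with the leading one of `core`.
        full = np.tensordot(full, core, axes=([-1], [0]))
    # Drop the dummy boundary ranks r_0 = r_d = 1.
    return full.squeeze(axis=(0, -1))

# Example: random cores with TT ranks (1, 2, 3, 1) for a 4 x 5 x 6 tensor.
rng = np.random.default_rng(0)
ranks, dims = [1, 2, 3, 1], [4, 5, 6]
cores = [rng.standard_normal((ranks[k], dims[k], ranks[k + 1]))
         for k in range(len(dims))]
T = tt_reconstruct(cores)
print(T.shape)  # (4, 5, 6)
```

Each entry then satisfies T[i, j, k] = G1[:, i, :] G2[:, j, :] G3[:, k, :] (a 1x1 matrix product), which is why zeroing out slices of the cores, as the sparsity-inducing prior encourages, directly shrinks the TT ranks.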