The Normalized Cross Density Functional: A Framework to Quantify Statistical Dependence for Random Processes

Paper Title

The Normalized Cross Density Functional: A Framework to Quantify Statistical Dependence for Random Processes

Authors

Hu, Bo; Principe, Jose C.

Abstract

This paper presents a novel approach to measuring statistical dependence between two random processes (r.p.) using a positive-definite function called the Normalized Cross Density (NCD). NCD is derived directly from the probability density functions of two r.p. and constructs a data-dependent Hilbert space, the Normalized Cross-Density Hilbert Space (NCD-HS). By Mercer's Theorem, the NCD norm can be decomposed into its eigenspectrum, which we name the Multivariate Statistical Dependence (MSD) measure, and their sum, the Total Dependence Measure (TSD). Hence, the NCD-HS eigenfunctions serve as a novel embedded feature space, suitable for quantifying r.p. statistical dependence. In order to apply NCD directly to r.p. realizations, we introduce an architecture with two multiple-output neural networks, a cost function, and an algorithm named the Functional Maximal Correlation Algorithm (FMCA). With FMCA, the two networks learn concurrently by approximating each other's outputs, extending the Alternating Conditional Expectation (ACE) for multivariate functions. We mathematically prove that FMCA learns the dominant eigenvalues and eigenfunctions of NCD directly from realizations. Preliminary results with synthetic data and medium-sized image datasets corroborate the theory. Different strategies for applying NCD are proposed and discussed, demonstrating the method's versatility and stability beyond supervised learning. Specifically, when the two r.p. are high-dimensional real-world images and a white uniform noise process, FMCA learns factorial codes, i.e., the occurrence of a code guarantees that a specific training set image was present, which is important for feature learning.
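The eigenspectrum idea in the abstract can be illustrated for two discrete random variables. The sketch below is not the authors' FMCA and does not use neural networks. It assumes a discrete analogue in which the normalized cross density is the matrix p(x, y) / sqrt(p(x) p(y)), and it takes the singular values of that matrix as a stand-in for the dependence spectrum. The function names `normalized_cross_density` and `dependence_spectrum` are hypothetical, chosen only for this illustration, and all marginals are assumed to be strictly positive.

```python
import numpy as np


def normalized_cross_density(joint):
    """Discrete stand-in for the NCD: p(x, y) / sqrt(p(x) * p(y)).

    Assumption for illustration only: `joint` is a 2-D table of
    non-negative weights whose row and column sums are all positive.
    """
    joint = np.asarray(joint, dtype=float)
    joint = joint / joint.sum()          # normalize to a joint pmf
    px = joint.sum(axis=1)               # marginal of X (rows)
    py = joint.sum(axis=0)               # marginal of Y (columns)
    return joint / np.sqrt(np.outer(px, py))


def dependence_spectrum(joint):
    """Singular values of the matrix above, without the trivial one.

    The largest singular value is always 1 (singular vectors sqrt(px)
    and sqrt(py)), so it carries no information about dependence and
    is dropped here.
    """
    s = np.linalg.svd(normalized_cross_density(joint), compute_uv=False)
    return s[1:]


# Independent variables: the joint pmf is the outer product of marginals.
independent = np.outer([0.3, 0.7], [0.4, 0.6])

# Fully dependent variables: Y is a deterministic function of X.
dependent = np.array([[0.5, 0.0],
                      [0.0, 0.5]])

print(dependence_spectrum(independent))
print(dependence_spectrum(dependent))
```

Under these assumptions the remaining singular values vanish (up to floating-point error) when the two variables are independent and reach 1 when one variable determines the other. FMCA, as described in the abstract, targets the continuous, high-dimensional case, where such a table cannot be formed and the eigenfunctions are learned from realizations instead.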
