论文标题

快速嵌入到锤子立方体中

Fast metric embedding into the Hamming cube

论文作者

Dirksen, Sjoerd, Mendelson, Shahar, Stollenwerk, Alexander

论文摘要

我们考虑将$ \ Mathbb {r}^n $的子集嵌入低维锤地数据集中的问题。我们构建了一个简单的,数据的和计算高效的图,该图以很高的概率来完成此任务:我们首先应用一个特定的结构化随机矩阵,我们称之为双循环矩阵;使用该矩阵需要线性存储,并且可以在接近线性的时间内执行矩阵矢量乘法。然后,我们通过将每个矢量的每个条目与随机阈值进行比较,将每个矢量从选择的间隔中随机选择。 我们根据集合的两个自然几何复杂性参数来估算此编码方案所需的位数 - 其欧几里得覆盖数量及其局部高斯复杂性。事实证明,我们得出的估计是人们所希望的最好的 - 按对数术语。 证明的关键是一种独立关注的现象:我们表明,双循环矩阵以两种重要方式模仿了高斯矩阵的行为。首先,它将$ \ mathbb {r}^n $中的任意集映射到一组良好的向量中。其次,它可以将$ \ ell_2^n $的任何有限子集的快速近乎静电嵌入到$ \ ell_1^m $中。在要嵌入的点数上,这种嵌入的尺寸与高斯矩阵相同,在接近线性的时间内(直至对数因素)在要嵌入的点数上。这改善了由于Ailon和Chazelle的著名建筑。

We consider the problem of embedding a subset of $\mathbb{R}^n$ into a low-dimensional Hamming cube in an almost isometric way. We construct a simple, data-oblivious, and computationally efficient map that achieves this task with high probability: we first apply a specific structured random matrix, which we call the double circulant matrix; using that matrix requires linear storage and matrix-vector multiplication can be performed in near-linear time. We then binarize each vector by comparing each of its entries to a random threshold, selected uniformly at random from a well-chosen interval. We estimate the number of bits required for this encoding scheme in terms of two natural geometric complexity parameters of the set - its Euclidean covering numbers and its localized Gaussian complexity. The estimate we derive turns out to be the best that one can hope for - up to logarithmic terms. The key to the proof is a phenomenon of independent interest: we show that the double circulant matrix mimics the behavior of a Gaussian matrix in two important ways. First, it maps an arbitrary set in $\mathbb{R}^n$ into a set of well-spread vectors. Second, it yields a fast near-isometric embedding of any finite subset of $\ell_2^n$ into $\ell_1^m$. This embedding achieves the same dimension reduction as a Gaussian matrix in near-linear time, under an optimal condition - up to logarithmic factors - on the number of points to be embedded. This improves a well-known construction due to Ailon and Chazelle.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源