Paper Title

New Coresets for Projective Clustering and Applications

Authors

Murad Tukan, Xuan Wu, Samson Zhou, Vladimir Braverman, Dan Feldman

Abstract

$(j,k)$-projective clustering is the natural generalization of the family of $k$-clustering and $j$-subspace clustering problems. Given a set of points $P$ in $\mathbb{R}^d$, the goal is to find $k$ flats of dimension $j$, i.e., affine subspaces, that best fit $P$ under a given distance measure. In this paper, we propose the first algorithm that returns an $L_\infty$ coreset of size polynomial in $d$. Moreover, we give the first strong coreset construction for general $M$-estimator regression. Specifically, we show that our construction provides efficient coreset constructions for Cauchy, Welsch, Huber, Geman-McClure, Tukey, $L_1-L_2$, and Fair regression, as well as general concave and power-bounded loss functions. Finally, we provide experimental results based on real-world datasets, showing the efficacy of our approach.
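To make the objective in the abstract concrete, the following is a minimal sketch of one common way to evaluate a $(j,k)$-projective clustering cost: each point is charged its distance to the nearest of the $k$ flats, and the charges are aggregated. The Euclidean distance, the sum and max aggregations, and the names `dist_to_flat` and `projective_cost` are assumptions made for illustration; the paper allows more general distance measures, and this is not its coreset algorithm.

```python
import math

def dist_to_flat(p, offset, basis):
    """Euclidean distance from point p to the affine flat offset + span(basis).

    Assumption of this sketch: `basis` is a list of j orthonormal vectors.
    """
    # Residual of p relative to a point on the flat.
    r = [pi - oi for pi, oi in zip(p, offset)]
    # Remove the component of the residual lying inside the flat.
    for b in basis:
        c = sum(ri * bi for ri, bi in zip(r, b))
        r = [ri - c * bi for ri, bi in zip(r, b)]
    return math.sqrt(sum(ri * ri for ri in r))

def projective_cost(points, flats, agg=sum):
    """Aggregate, over all points, the distance to the nearest of the k flats."""
    return agg(min(dist_to_flat(p, o, B) for o, B in flats) for p in points)

# Toy instance in R^2 with k = 2 flats of dimension j = 1 (two horizontal lines).
points = [(0.0, 1.0), (2.0, -1.0), (5.0, 3.0)]
flats = [((0.0, 0.0), [(1.0, 0.0)]),   # the x-axis
         ((0.0, 3.0), [(1.0, 0.0)])]   # the line y = 3

sum_cost = projective_cost(points, flats)            # sum of distances
max_cost = projective_cost(points, flats, agg=max)   # worst-case (L_inf-style) cost
```

Passing `agg=max` gives the worst-case fitting cost, which is the quantity an $L_\infty$ coreset is meant to approximate; the default `agg=sum` gives the sum-of-distances variant.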
