Paper Title
An Empirical Study on Clustering Pretrained Embeddings: Is Deep Strictly Better?
Paper Authors
Paper Abstract
Recent research in clustering face embeddings has found that unsupervised, shallow, heuristic-based methods -- including $k$-means and hierarchical agglomerative clustering -- underperform supervised, deep, inductive methods. While the reported improvements are indeed impressive, experiments are mostly limited to face datasets, where the clustered embeddings are highly discriminative or well-separated by class (Recall@1 above 90% and often nearing ceiling), and the experimental methodology seemingly favors the deep methods. We conduct a large-scale empirical study of 17 clustering methods across three datasets and obtain several robust findings. Notably, deep methods are surprisingly fragile on embeddings with more uncertainty, merely matching or even underperforming shallow, heuristic-based methods. When embeddings are highly discriminative, deep methods do outperform the baselines, consistent with past results, but the margin between methods is much smaller than previously reported. We believe our benchmarks broaden the scope of supervised clustering methods beyond the face domain and can serve as a foundation on which these methods could be improved. To enable reproducibility, we include all necessary details in the appendices, and plan to release the code.
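To make the shallow baselines named in the abstract concrete, here is a minimal sketch (assuming scikit-learn; the synthetic embeddings and the choice of adjusted Rand index as the evaluation metric are illustrative, not the paper's setup) of running $k$-means and hierarchical agglomerative clustering on pretrained-style embedding vectors:

```python
import numpy as np
from sklearn.cluster import KMeans, AgglomerativeClustering
from sklearn.metrics import adjusted_rand_score

rng = np.random.default_rng(0)

# Toy stand-in for "highly discriminative" embeddings:
# 3 well-separated Gaussian classes in 64 dimensions, 50 points each.
centers = rng.normal(size=(3, 64)) * 5.0
labels_true = np.repeat(np.arange(3), 50)
X = centers[labels_true] + rng.normal(size=(150, 64))

# Shallow, heuristic-based baselines from the abstract.
kmeans_labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
hac_labels = AgglomerativeClustering(n_clusters=3).fit_predict(X)

# Agreement with ground-truth classes (1.0 = perfect clustering).
print("k-means ARI:", adjusted_rand_score(labels_true, kmeans_labels))
print("HAC ARI:    ", adjusted_rand_score(labels_true, hac_labels))
```

On well-separated embeddings like these, both baselines recover the classes almost perfectly, which mirrors the abstract's point that the gap left for deep methods shrinks as embeddings become more discriminative.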