Paper title
Quasar and galaxy classification using Gaia EDR3 and CatWise2020
Paper authors
Paper abstract
In this work, we assess whether combining Gaia photometry and astrometry with infrared data from CatWISE improves the identification of extragalactic sources compared to the classification obtained using Gaia data alone. We evaluate different input feature configurations and prior functions, with the aim of presenting a classification methodology that integrates prior knowledge of realistic class distributions in the universe. We compare different classifiers, namely Gaussian Mixture Models (GMMs), XGBoost and CatBoost, and classify sources into three classes: star, quasar, and galaxy. The target quasar and galaxy class labels are obtained from SDSS16, and the star labels from Gaia EDR3. In our approach, we adjust the posterior probabilities via a prior function so that they reflect the intrinsic distribution of extragalactic sources in the universe. We introduce two priors: a global prior reflecting the overall rarity of quasars and galaxies, and a mixed prior that additionally incorporates the distribution of these sources as a function of Galactic latitude and magnitude. Our best classification performance, in terms of completeness and purity of the galaxy and quasar classes, is achieved using the mixed prior for sources at high latitudes and in the magnitude range G = 18.5 to 19.5. We apply our best-performing classifier to three application datasets from Gaia DR3, and find that the global prior is more conservative than the mixed prior in what it considers to be a quasar or a galaxy. In particular, when applied to the pure quasar and galaxy candidate samples, we attain a purity of 97% for quasars and 99.9% for galaxies using the global prior, and purities of 96% and 99% respectively using the mixed prior. We conclude by discussing the importance of applying adjusted priors that portray realistic class distributions in the universe.
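The prior adjustment and the two quality metrics named in the abstract can be illustrated with a minimal sketch. It assumes the common reweighting rule in which each posterior is multiplied by the ratio of the target prior to the training-set class fraction and then renormalised; the abstract does not spell out the exact formula, so this may differ from the authors' implementation. The function names, the prior values, and the example probabilities are all hypothetical and chosen only for illustration.

```python
import numpy as np

CLASSES = ("star", "quasar", "galaxy")


def adjust_posteriors(posteriors, train_prior, target_prior):
    """Reweight posteriors from the training class fractions to a target prior.

    Assumed rule (not taken from the paper): P'(c|x) is proportional to
    P(c|x) * target_prior(c) / train_prior(c), renormalised per source.
    """
    posteriors = np.asarray(posteriors, dtype=float)
    weights = np.asarray(target_prior, dtype=float) / np.asarray(train_prior, dtype=float)
    reweighted = posteriors * weights
    return reweighted / reweighted.sum(axis=1, keepdims=True)


def completeness_and_purity(y_true, y_pred, label):
    """Completeness = TP / (true members); purity = TP / (predicted members)."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    true_pos = np.sum((y_true == label) & (y_pred == label))
    n_true = np.sum(y_true == label)
    n_pred = np.sum(y_pred == label)
    completeness = true_pos / n_true if n_true else float("nan")
    purity = true_pos / n_pred if n_pred else float("nan")
    return completeness, purity


# Hypothetical values: a balanced training set and a global prior in which
# quasars and galaxies are rare relative to stars.
train_prior = np.array([1 / 3, 1 / 3, 1 / 3])
global_prior = np.array([0.998, 0.001, 0.001])

# Hypothetical classifier outputs, columns ordered as in CLASSES.
posteriors = np.array([
    [0.10, 0.85, 0.05],        # moderately confident quasar
    [0.0002, 0.9995, 0.0003],  # very confident quasar
    [0.0001, 0.0004, 0.9995],  # very confident galaxy
])

adjusted = adjust_posteriors(posteriors, train_prior, global_prior)
predicted = [CLASSES[i] for i in adjusted.argmax(axis=1)]
print(predicted)
```

With these illustrative numbers, the moderately confident quasar is reassigned to the star class once the rarity of quasars is taken into account, while the two very confident sources keep their labels. A latitude- and magnitude-dependent prior could be sketched in the same way by passing `target_prior` as an array with one row per source, since the division and multiplication broadcast row by row.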