Metadata-guided Consistency Learning for High Content Images

Paper title

Metadata-guided Consistency Learning for High Content Images

Authors

Johan Fredin Haslum, Christos Matsoukas, Karl-Johan Leuchowius, Erik Müllers, Kevin Smith

Abstract

High content imaging assays can capture rich phenotypic response data for large sets of compound treatments, aiding in the characterization and discovery of novel drugs. However, extracting representative features from high content images that can capture subtle nuances in phenotypes remains challenging. The lack of high-quality labels makes it difficult to achieve satisfactory results with supervised deep learning. Self-supervised learning methods have shown great success on natural images, and offer an attractive alternative for microscopy images as well. However, we find that self-supervised learning techniques underperform on high content imaging assays. One challenge is the undesirable domain shifts present in the data, known as batch effects, which are caused by biological noise or uncontrolled experimental conditions. To this end, we introduce Cross-Domain Consistency Learning (CDCL), a self-supervised approach that is able to learn in the presence of batch effects. CDCL enforces the learning of biological similarities while disregarding undesirable batch-specific signals, leading to more useful and versatile representations. These features are organised according to their morphological changes and are more useful for downstream tasks, such as distinguishing treatments and mechanisms of action.
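
The abstract does not describe how CDCL builds its training signal, so the following is only an illustrative sketch of one way metadata could guide a consistency objective: pairing images that share a treatment but come from different batches, so that the batch signal is the one thing the two views do not have in common. The record layout `(image_id, treatment, batch)` and the function name `sample_cross_batch_pairs` are assumptions made for this example and are not taken from the paper.

```python
import random
from collections import defaultdict


def sample_cross_batch_pairs(records, n_pairs, seed=0):
    """Sample pairs of image ids with the same treatment but different batches.

    `records` is a list of (image_id, treatment, batch) tuples. This layout is
    a hypothetical stand-in for whatever metadata an assay actually provides.
    """
    rng = random.Random(seed)

    # Group image ids by treatment, then by batch.
    by_treatment = defaultdict(lambda: defaultdict(list))
    for image_id, treatment, batch in records:
        by_treatment[treatment][batch].append(image_id)

    # Only treatments observed in at least two batches can form such pairs.
    eligible = sorted(t for t, batches in by_treatment.items() if len(batches) >= 2)
    if not eligible:
        raise ValueError("no treatment appears in two or more batches")

    pairs = []
    for _ in range(n_pairs):
        treatment = rng.choice(eligible)
        # Pick two distinct batches for this treatment, then one image from each.
        batch_a, batch_b = rng.sample(sorted(by_treatment[treatment]), 2)
        first = rng.choice(by_treatment[treatment][batch_a])
        second = rng.choice(by_treatment[treatment][batch_b])
        pairs.append((first, second))
    return pairs


if __name__ == "__main__":
    toy_records = [
        ("img0", "DMSO", "plate1"),
        ("img1", "DMSO", "plate2"),
        ("img2", "taxol", "plate1"),
        ("img3", "taxol", "plate2"),
    ]
    for pair in sample_cross_batch_pairs(toy_records, n_pairs=3):
        print(pair)
```

In a self-supervised pipeline, pairs like these could stand in for the two augmented views of a single image that natural-image methods typically use; whether CDCL does exactly this is not stated in the abstract.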
