Paper title
Towards an Automatic Analysis of CHO-K1 Suspension Growth in Microfluidic Single-cell Cultivation
Paper authors
Paper abstract
Motivation: Innovative microfluidic systems promise to greatly facilitate the spatio-temporal analysis of single cells under well-defined environmental conditions, allowing novel insights into population heterogeneity and opening new opportunities for fundamental and applied biotechnology. Microfluidic experiments, however, generate vast amounts of data, such as time series of microscopic images, for which manual evaluation is infeasible due to the sheer number of samples. While classical image processing techniques do not yield satisfactory results in this domain, modern deep learning techniques such as convolutional networks can be sufficiently versatile for diverse tasks, including automatic cell tracking and counting as well as the extraction of critical parameters such as the growth rate. Successful training of current supervised deep learning models, however, requires label information, such as the number or positions of cells in each image of a series; obtaining these annotations is very costly in this setting. Results: We propose a novel machine learning architecture together with a specialized training procedure, which allows us to infuse a deep neural network with human-powered abstraction on the level of data, leading to a high-performing regression model that requires only a very small amount of labeled data. Specifically, we train a generative model simultaneously on natural and synthetic data, so that it learns a shared representation from which a target variable, such as the cell count, can be reliably estimated.
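The abstract names the growth rate as one of the critical parameters to be extracted once per-image cell counts are available. As a minimal sketch of that downstream step, and not the method of the paper, the following assumes purely exponential growth and fits a straight line to the logarithm of the counts. The function name `estimate_growth_rate` and the sample counts are hypothetical and chosen only for illustration.

```python
import math


def estimate_growth_rate(times_h, counts):
    """Fit ln(count) = ln(N0) + rate * t by ordinary least squares.

    Assumes exponential growth (an assumption of this sketch, not a
    claim from the abstract). Returns (rate per hour, doubling time in hours).
    """
    if len(times_h) != len(counts) or len(times_h) < 2:
        raise ValueError("need at least two (time, count) pairs of equal length")
    if any(c <= 0 for c in counts):
        raise ValueError("counts must be positive to take the logarithm")

    logs = [math.log(c) for c in counts]
    n = len(times_h)
    t_mean = sum(times_h) / n
    y_mean = sum(logs) / n

    # Least-squares slope: covariance(t, ln N) / variance(t)
    sxx = sum((t - t_mean) ** 2 for t in times_h)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times_h, logs))

    rate = sxy / sxx
    doubling_time = math.log(2) / rate if rate > 0 else math.inf
    return rate, doubling_time


# Hypothetical counts for one cultivation chamber, one image every 10 hours.
times_h = [0.0, 10.0, 20.0, 30.0, 40.0]
counts = [1, 2, 4, 8, 16]
rate, doubling_time = estimate_growth_rate(times_h, counts)
print(f"growth rate: {rate:.4f} 1/h, doubling time: {doubling_time:.1f} h")
```

In practice the counts would come from the regression model rather than from manual annotation, and a log-linear fit is only appropriate while the population is in its exponential phase.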