Paper Title
Semi-Supervised Learning for Sparsely-Labeled Sequential Data: Application to Healthcare Video Processing
Authors
Abstract
Labeled data is a critical resource for training and evaluating machine learning models. However, many real-life datasets are only partially labeled. We propose a semi-supervised machine learning training strategy to improve event detection performance on sequential data, such as video recordings, when only sparse labels are available, such as event start times without their corresponding end times. Our method uses noisy guesses of the events' end times to train event detection models. Depending on how conservative these guesses are, mislabeled samples may be introduced into the training set. We further propose a mathematical model for explaining and estimating the evolution of the classification performance for increasingly noisy end time estimates. We show that neural networks can improve their detection performance by leveraging more training data with less conservative approximations, despite the higher proportion of incorrect labels. We adapt sequential versions of CIFAR-10 and MNIST, and use the Berkeley MHAD and HMDB51 video datasets, to empirically evaluate our method, and find that our risk-tolerant strategy outperforms conservative estimates by 3.5 points of mean average precision for CIFAR, 30 points for MNIST, 3 points for MHAD, and 14 points for HMDB51. We then leverage the proposed training strategy to tackle a real-life application: processing continuous video recordings of epilepsy patients. Our method outperforms baseline labeling methods by 17 points of average precision and reaches a classification performance similar to that of fully supervised models. We share part of the code for this article.
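As a rough illustration of the labeling strategy described in the abstract, the Python sketch below expands sparse event start times into per-frame labels using a guessed event duration. The function name `expand_sparse_labels`, the fixed `guessed_duration` parameter, and the per-frame binary representation are illustrative assumptions for this sketch, not the authors' actual implementation.

```python
import numpy as np

def expand_sparse_labels(num_frames, event_starts, guessed_duration):
    """Turn sparse event start indices into per-frame binary labels.

    Illustrative sketch only: each event is assumed to last
    `guessed_duration` frames after its annotated start. A small
    (conservative) duration yields fewer positive frames but misses the
    tail of long events; a large (risk-tolerant) duration covers more of
    each event at the cost of mislabeling frames after the event ends.
    """
    labels = np.zeros(num_frames, dtype=np.int64)
    for start in event_starts:
        end = min(start + guessed_duration, num_frames)
        labels[start:end] = 1  # frames assumed to belong to the event
    return labels

# Example: a 1000-frame recording with two annotated event onsets.
conservative = expand_sparse_labels(1000, event_starts=[120, 640], guessed_duration=10)
risk_tolerant = expand_sparse_labels(1000, event_starts=[120, 640], guessed_duration=60)
print(conservative.sum(), risk_tolerant.sum())  # 20 vs. 120 positive frames
```

Under this hypothetical framing, the conservative and risk-tolerant variants differ only in the guessed duration; the abstract's claim is that training on the noisier but larger set of positive frames can improve detection performance.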