Paper Title

Constrained Generative Adversarial Network Ensembles for Sharable Synthetic Data Generation

Authors

Engin Dikici, Luciano M. Prevedello, Matthew Bigelow, Richard D. White, Barbaros Selnur Erdal

Abstract

The sharing of medical imaging datasets between institutions, and even within the same institution, is limited by various regulatory and legal barriers. Although these limitations are necessary for protecting patient privacy and setting strict boundaries for data ownership, medical research projects that require large datasets suffer considerably as a result. Machine learning has been revolutionized by the emergence of deep neural network approaches in recent years, which makes the data-related limitations an even larger problem, because these techniques commonly require immense imaging datasets. This paper introduces constrained Generative Adversarial Network ensembles (cGANe) to address this problem by altering the representation of the imaging data while retaining the significant information, enabling similar research results to be reproduced elsewhere with the sharable data. Accordingly, a framework for generating a cGANe is described, and the approach is validated for the generation of synthetic 3D brain metastatic region data from T1-weighted contrast-enhanced MRI studies. At 90% brain metastases (BM) detection sensitivity, our previously reported detection algorithm produced on average 9.12 false-positive BM detections per patient after training with the original data, and 9.53 false positives after training with the cGANe-generated synthetic data. Although the applicability of the introduced approach needs further validation with a range of medical imaging data types, the results suggest that the BM-detection algorithm can achieve comparable performance by using cGANe-generated synthetic data. Hence, the proposed approach may be generalized to various modalities in the near future.
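
The headline metric in the abstract, average false positives per patient at a fixed detection sensitivity, can be illustrated with a minimal sketch. This is not the authors' evaluation code. The function name, the `Detection` tuple layout, and the assumption that each true-positive detection matches a distinct lesion are all hypothetical choices made for illustration.

```python
from typing import List, Optional, Tuple

# Hypothetical record layout: (patient_id, confidence_score, is_true_metastasis)
Detection = Tuple[str, float, bool]


def false_positives_at_sensitivity(
    detections: List[Detection],
    n_lesions: int,
    n_patients: int,
    target: float = 0.90,
) -> Optional[Tuple[float, float]]:
    """Return (score_threshold, mean false positives per patient) at the
    highest threshold whose sensitivity reaches `target`, or None if the
    target sensitivity is never reached.

    Assumption: every true-positive detection corresponds to a distinct
    lesion, so sensitivity is true positives divided by `n_lesions`.
    """
    # Sweep the threshold from the most to the least confident detection.
    ranked = sorted(detections, key=lambda d: d[1], reverse=True)
    tp = fp = 0
    for i, (_, score, is_true) in enumerate(ranked):
        if is_true:
            tp += 1
        else:
            fp += 1
        # Detections with equal scores share a threshold; evaluate only
        # after the whole tie group has been counted.
        if i + 1 < len(ranked) and ranked[i + 1][1] == score:
            continue
        if tp / n_lesions >= target:
            return score, fp / n_patients
    return None


if __name__ == "__main__":
    toy = [
        ("a", 0.9, True),
        ("a", 0.8, False),
        ("b", 0.7, True),
        ("b", 0.6, False),
        ("b", 0.5, True),
    ]
    print(false_positives_at_sensitivity(toy, n_lesions=3, n_patients=2))
```

Running the same computation once for a detector trained on original data and once for a detector trained on synthetic data gives the kind of side-by-side comparison the abstract reports (9.12 versus 9.53).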
