An Ensemble Teacher-Student Learning Approach with Poisson Sub-sampling to Differential Privacy Preserving Speech Recognition

Paper Title

An Ensemble Teacher-Student Learning Approach with Poisson Sub-sampling to Differential Privacy Preserving Speech Recognition

Authors

Yang, Chao-Han Huck, Qi, Jun, Siniscalchi, Sabato Marco, Lee, Chin-Hui

Abstract

We propose an ensemble learning framework with Poisson sub-sampling to effectively train a collection of teacher models that provides a differential privacy (DP) guarantee for the training data. Through boosting under DP, a student model derived from the training data suffers little degradation relative to models trained with no privacy protection. Our proposed solution leverages two mechanisms: (i) privacy budget amplification via Poisson sub-sampling, so that training a target prediction model requires less noise to achieve the same level of privacy budget, and (ii) a combination of the sub-sampling technique and an ensemble teacher-student learning framework that introduces DP-preserving noise at the output of the teacher models and transfers DP-preserving properties via noisy labels. Privacy-preserving student models are then trained with the noisy labels to learn DP-protected knowledge from the teacher model ensemble. Experimental evidence on spoken command recognition and continuous speech recognition of Mandarin speech shows that our proposed framework greatly outperforms existing DP-preserving algorithms in both speech processing tasks.
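
The two mechanisms named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: all function names and parameters are hypothetical, the use of Laplace noise on teacher vote counts is an assumption borrowed from PATE-style aggregation (the abstract does not specify the noise mechanism), and the amplification formula is the standard subsampling bound for pure ε-DP, which may differ from the privacy accounting used in the paper.

```python
import math
import numpy as np


def poisson_subsample(n_examples, rate, rng):
    """Poisson sub-sampling: keep each index independently with probability `rate`."""
    mask = rng.random(n_examples) < rate
    return np.flatnonzero(mask)


def amplified_epsilon(epsilon, rate):
    """Standard amplification-by-subsampling bound for pure eps-DP:
    eps' = log(1 + rate * (exp(eps) - 1)). Assumed here for illustration."""
    return math.log1p(rate * math.expm1(epsilon))


def noisy_teacher_labels(teacher_preds, n_classes, noise_scale, rng):
    """Aggregate teacher predictions into noisy labels (PATE-style assumption).

    teacher_preds: int array of shape (n_teachers, n_samples) with class ids.
    Laplace noise is added to per-class vote counts before taking the argmax.
    """
    n_teachers, n_samples = teacher_preds.shape
    votes = np.zeros((n_samples, n_classes))
    for preds in teacher_preds:
        # Each teacher casts one vote per sample.
        votes[np.arange(n_samples), preds] += 1
    noisy_votes = votes + rng.laplace(scale=noise_scale, size=votes.shape)
    return noisy_votes.argmax(axis=1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Each teacher would be trained on its own Poisson sub-sample of the data.
    subsets = [poisson_subsample(1000, 0.1, rng) for _ in range(5)]
    print([len(s) for s in subsets])
    # Sub-sampling at rate 0.1 shrinks a nominal epsilon of 1.0.
    print(amplified_epsilon(1.0, 0.1))
    # Stand-in teacher outputs on 8 unlabeled samples, 3 classes.
    fake_preds = rng.integers(0, 3, size=(5, 8))
    print(noisy_teacher_labels(fake_preds, 3, noise_scale=1.0, rng=rng))
```

In this sketch a student model would be trained on the labels returned by `noisy_teacher_labels`, so it never sees the private training examples directly.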
