Paper Title

Federated Self-Supervised Learning for Acoustic Event Classification

Authors

Meng Feng, Chieh-Chi Kao, Qingming Tang, Ming Sun, Viktor Rozgic, Spyros Matsoukas, Chao Wang

Abstract

Standard acoustic event classification (AEC) solutions require large-scale collection of data from client devices for model optimization. Federated learning (FL) is a compelling framework that decouples data collection from model training to enhance customer privacy. In this work, we investigate the feasibility of applying FL to improve AEC performance when no customer data can be uploaded directly to the server. We assume that no pseudo labels can be inferred from on-device user inputs, in line with typical AEC use cases. We adapt self-supervised learning to the FL framework for on-device continual learning of representations, which improves the performance of downstream AEC classifiers without any labeled or pseudo-labeled data. Compared to a baseline without FL, the proposed method improves precision by up to 20.3% relative while maintaining recall. Our work differs from prior work in FL in that our approach requires no user-generated learning targets, and the data we use is collected from our Beta program and de-identified, to maximally simulate production settings.
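The abstract describes combining federated training with a label-free self-supervised objective: each device updates a representation model on its private, unlabeled audio, and a server aggregates the client models. A minimal sketch of this pattern, assuming plain FedAvg aggregation and using a toy tied-weight linear autoencoder reconstruction loss as a stand-in for the paper's actual self-supervised objective (which the abstract does not specify):

```python
import numpy as np

rng = np.random.default_rng(0)

def ssl_loss(W, X):
    """Mean self-supervised reconstruction loss ||W^T W x - x||^2 (no labels)."""
    R = X @ W.T @ W  # reconstructions, shape (n, d)
    return float(np.mean(np.sum((R - X) ** 2, axis=1)))

def local_ssl_update(W, X, lr=0.01, steps=5):
    """One client's on-device update: SGD on the reconstruction loss.

    Gradient of ||W^T W x - x||^2 w.r.t. W is 2[(Wx)e^T + (We)x^T],
    where e = W^T W x - x.
    """
    W = W.copy()
    for _ in range(steps):
        for x in X:
            e = W.T @ (W @ x) - x  # reconstruction error
            W -= lr * 2 * (np.outer(W @ x, e) + np.outer(W @ e, x))
    return W

def fedavg(client_weights, client_sizes):
    """Server-side FedAvg: data-size-weighted average of client models."""
    total = sum(client_sizes)
    return sum((n / total) * W for W, n in zip(client_weights, client_sizes))

# Toy setup (hypothetical): 4 clients, each holding private unlabeled
# "audio feature" vectors that never leave the device.
d, k = 8, 3
clients = [rng.normal(size=(20, d)) for _ in range(4)]
W_init = 0.1 * rng.normal(size=(k, d))
loss_before = ssl_loss(W_init, np.vstack(clients))

W_global = W_init
for _ in range(10):  # federated rounds: local SSL updates, then aggregation
    local_models = [local_ssl_update(W_global, X) for X in clients]
    W_global = fedavg(local_models, [len(X) for X in clients])

loss_after = ssl_loss(W_global, np.vstack(clients))
```

Only model weights cross the network; the raw data and any labels stay on-device, which is the privacy property the abstract emphasizes. In the paper's setting, the learned encoder would then feed a downstream AEC classifier.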
