GWA: A Large High-Quality Acoustic Dataset for Audio Processing

Paper Title

GWA: A Large High-Quality Acoustic Dataset for Audio Processing

Authors

Zhenyu Tang, Rohith Aralikatti, Anton Ratnarajah, Dinesh Manocha

Abstract

We present the Geometric-Wave Acoustic (GWA) dataset, a large-scale audio dataset of about 2 million synthetic room impulse responses (IRs) and their corresponding detailed geometric and simulation configurations. Our dataset samples acoustic environments from over 6.8K high-quality, diverse, and professionally designed houses represented as semantically labeled 3D meshes. We also present a novel real-world acoustic materials assignment scheme based on semantic matching that uses a sentence transformer model. We compute high-quality impulse responses corresponding to accurate low-frequency and high-frequency wave effects by automatically calibrating geometric acoustic ray-tracing with a finite-difference time-domain wave solver. We demonstrate the higher accuracy of our IRs by comparing them with recorded IRs from complex real-world environments. Moreover, we highlight the benefits of GWA on audio deep learning tasks such as automated speech recognition, speech enhancement, and speech separation. This dataset is the first with accurate wave acoustic simulations in complex scenes. Code and data are available at https://gamma.umd.edu/pro/sound/gwa.
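
As a rough illustration of how a room impulse response is typically used in speech tasks such as those named in the abstract, the sketch below convolves a dry signal with an IR to obtain a reverberant signal. It is not taken from the GWA code base: the function name `convolve` and the toy sample values are assumptions made for this example, and a real pipeline would load audio and IR files and would normally use an FFT-based convolution instead of this direct loop.

```python
def convolve(dry, ir):
    """Full discrete convolution of a dry signal with an impulse response.

    Hypothetical helper for illustration; both inputs are plain lists of
    float samples at the same sample rate.
    """
    out = [0.0] * (len(dry) + len(ir) - 1)
    for i, x in enumerate(dry):
        for j, h in enumerate(ir):
            # Each IR tap adds a delayed, scaled copy of the dry sample.
            out[i + j] += x * h
    return out


# Toy data, made up for this example (not samples from the dataset).
dry = [1.0, 0.5, 0.0, -0.5]
ir = [1.0, 0.0, 0.25]  # direct sound plus one echo two samples later

wet = convolve(dry, ir)
print(wet)
```

The output has `len(dry) + len(ir) - 1` samples, because the reverberant tail extends past the end of the dry signal.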
