Paper Title
Active Acquisition for Multimodal Temporal Data: A Challenging Decision-Making Task
Paper Authors
Paper Abstract
We introduce a challenging decision-making task that we call active acquisition for multimodal temporal data (A2MT). In many real-world scenarios, input features are not readily available at test time and must instead be acquired at significant cost. With A2MT, we aim to learn agents that actively select which modalities of an input to acquire, trading off acquisition cost and predictive performance. A2MT extends a previous task called active feature acquisition to temporal decision making about high-dimensional inputs. We propose a method based on the Perceiver IO architecture to address A2MT in practice. Our agents are able to solve a novel synthetic scenario requiring practically relevant cross-modal reasoning skills. On two large-scale, real-world datasets, Kinetics-700 and AudioSet, our agents successfully learn cost-reactive acquisition behavior. However, an ablation reveals they are unable to learn adaptive acquisition strategies, emphasizing the difficulty of the task even for state-of-the-art models. Applications of A2MT may be impactful in domains like medicine, robotics, or finance, where modalities differ in acquisition cost and informativeness.
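The trade-off between acquisition cost and predictive performance can be illustrated with a minimal sketch. The abstract does not state the exact objective, so everything here is an assumption made for illustration: the function name `episode_return`, the additive per-modality costs, and the fixed reward for a correct prediction are all hypothetical and not taken from the paper.

```python
def episode_return(
    acquisitions: list[dict[str, bool]],
    costs: dict[str, float],
    correct: bool,
    prediction_reward: float = 1.0,
) -> float:
    """Assumed objective: reward for a correct prediction minus total acquisition cost.

    acquisitions[t][m] is True if the agent acquired modality m at time step t.
    """
    # Sum the cost of every modality acquired at every time step.
    total_cost = sum(
        costs[modality]
        for step in acquisitions
        for modality, acquired in step.items()
        if acquired
    )
    return (prediction_reward if correct else 0.0) - total_cost


# Hypothetical costs: video is assumed to be more expensive than audio.
costs = {"video": 0.25, "audio": 0.125}

# The agent acquires video at step 0, audio at step 1, and nothing at step 2.
acquisitions = [
    {"video": True, "audio": False},
    {"video": False, "audio": True},
    {"video": False, "audio": False},
]

print(episode_return(acquisitions, costs, correct=True))  # prints 0.625
```

Under this assumed objective, a cost-reactive agent acquires less as the costs rise, while an adaptive agent would additionally vary which modalities it acquires depending on the input it has observed so far.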