Paper Title
Non-Stationary Representation Learning in Sequential Linear Bandits
Authors
Abstract
We study representation learning for multi-task decision-making in non-stationary environments. We consider the framework of sequential linear bandits, in which an agent performs a series of tasks drawn from distinct sets, each associated with a different environment. The embeddings of the tasks in each set share a low-dimensional feature extractor, called a representation, and the representations differ across sets. We propose an online algorithm that facilitates efficient decision-making by adaptively learning and transferring non-stationary representations. We prove that our algorithm significantly outperforms existing algorithms that treat tasks independently. We also conduct experiments on both synthetic and real data to validate our theoretical insights and demonstrate the efficacy of our algorithm.
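The shared-representation structure described in the abstract can be written in the form commonly used for low-rank linear bandits. The sketch below is an assumption for illustration, not the paper's own notation: the symbols $d$, $k$, $B_s$, $w_t$, $x_t$, $\eta_t$, and the set index $s(t)$ are all hypothetical names introduced here.

```latex
% Hypothetical notation: task t belongs to set s(t); d is the ambient
% dimension and k << d is the dimension of the shared representation.
\begin{align*}
  r_t      &= \langle x_t, \theta_t \rangle + \eta_t
             && \text{(linear reward for action } x_t \in \mathbb{R}^d \text{ with noise } \eta_t\text{)} \\
  \theta_t &= B_{s(t)}\, w_t,
             \quad B_{s(t)} \in \mathbb{R}^{d \times k},\; w_t \in \mathbb{R}^k
             && \text{(tasks in the same set share } B_{s(t)}\text{)} \\
  B_s      &\neq B_{s'} \ \text{ for } s \neq s'
             && \text{(the representation changes across sets)}
\end{align*}
```

Under this reading, a learner that has estimated $B_s$ only needs to identify the $k$-dimensional vector $w_t$ for each new task in set $s$, which is the intuition behind transferring a representation rather than treating every task independently.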