Paper Title
Stochastic Backpropagation: A Memory Efficient Strategy for Training Video Models
Paper Authors
Paper Abstract
We propose a memory-efficient method, named Stochastic Backpropagation (SBP), for training deep neural networks on videos. It is based on the finding that gradients from incomplete execution of backpropagation can still effectively train models with minimal accuracy loss, which we attribute to the high redundancy of videos. SBP keeps all forward paths but randomly and independently removes the backward paths for each network layer in each training step. It reduces GPU memory cost by eliminating the need to cache activation values corresponding to the dropped backward paths, whose number can be controlled by an adjustable keep-ratio. Experiments show that SBP can be applied to a wide range of video models, yielding up to 80.0% GPU memory savings and a 10% training speedup with less than a 1% accuracy drop on action recognition and temporal action detection.
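To make the mechanism concrete, below is a minimal PyTorch sketch of the core idea: every frame takes the forward path, but only a random keep-ratio fraction of frames builds an autograd graph, so the activations of the dropped frames are never cached for backward. This is an illustration, not the authors' implementation: the class name `SBPWrapper` is hypothetical, and it assumes for simplicity that the wrapped layer acts independently on each frame and preserves the input shape.

```python
import torch
import torch.nn as nn


class SBPWrapper(nn.Module):
    """Sketch of Stochastic Backpropagation for one layer (hypothetical helper).

    All frames take the forward path, but only a random `keep_ratio`
    fraction of them is run with autograd enabled; the rest run under
    torch.no_grad(), so no activations are cached for their backward
    paths. Assumes the wrapped layer acts frame-wise and preserves shape.
    """

    def __init__(self, layer: nn.Module, keep_ratio: float = 0.5):
        super().__init__()
        self.layer = layer
        self.keep_ratio = keep_ratio

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, time, dim) video features.
        if not self.training:
            return self.layer(x)
        t = x.shape[1]
        n_keep = max(1, int(round(t * self.keep_ratio)))
        perm = torch.randperm(t, device=x.device)
        keep, drop = perm[:n_keep], perm[n_keep:]

        # Kept frames: normal forward pass that caches activations for backward.
        y_keep = self.layer(x[:, keep])
        # Dropped frames: forward only; no graph is built, nothing is cached.
        with torch.no_grad():
            y_drop = self.layer(x[:, drop])

        # Reassemble the frames in their original temporal order.
        y = torch.cat([y_keep, y_drop], dim=1)
        return y[:, torch.argsort(perm)]


# Usage sketch: gradients flow through roughly half of the 16 frames.
block = SBPWrapper(nn.Sequential(nn.Linear(256, 256), nn.GELU()), keep_ratio=0.5)
x = torch.randn(2, 16, 256, requires_grad=True)
block(x).sum().backward()
```

In the paper, this keep/drop decision is made independently per layer and per training step; a full implementation would also need to handle temporal layers whose outputs mix information across frames, which this frame-wise sketch does not cover.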