Paper Title
FAR: Fourier Aerial Video Recognition
Authors
Abstract
We present Fourier Activity Recognition (FAR), an algorithm for activity recognition in UAV video. Our formulation uses a novel Fourier object disentanglement method to innately separate the human agent, which is typically small, from the background. The disentanglement technique operates in the frequency domain to characterize the extent of temporal change of spatial pixels, and exploits the convolution-multiplication property of the Fourier transform to map this representation to the corresponding object-background entangled features obtained from the network. To encapsulate contextual information and long-range space-time dependencies, we present a novel Fourier Attention algorithm, which emulates the benefits of self-attention by modeling the weighted outer product in the frequency domain. Our Fourier Attention formulation requires far fewer computations than self-attention. We have evaluated our approach on multiple UAV datasets, including UAV Human RGB, UAV Human Night, Drone Action, and NEC Drone. We demonstrate a relative improvement of 8.02% to 38.69% in top-1 accuracy, and our method is up to 3 times faster than prior work.
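The two frequency-domain ideas mentioned in the abstract can be illustrated with a small NumPy sketch. This is not the FAR implementation: the function names (`temporal_change_map`, `circular_conv_via_fft`, `circular_conv_direct`), the synthetic clip, and the choice of summed non-DC amplitude as the "extent of temporal change" are all assumptions made for illustration. The first function scores each pixel by how much it varies over time, and the other two check the standard convolution-multiplication property of the discrete Fourier transform.

```python
import numpy as np


def temporal_change_map(video):
    """Score each pixel by its temporal variation (illustrative assumption).

    video: array of shape (T, H, W). Returns an (H, W) map scaled to [0, 1].
    """
    spectrum = np.fft.fft(video, axis=0)        # FFT along the time axis
    amplitude = np.abs(spectrum)
    non_dc_energy = amplitude[1:].sum(axis=0)   # drop the DC (static) term
    peak = non_dc_energy.max()
    return non_dc_energy / peak if peak > 0 else non_dc_energy


def circular_conv_via_fft(a, b):
    """Circular convolution computed as a product in the frequency domain."""
    return np.real(np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)))


def circular_conv_direct(a, b):
    """Reference circular convolution computed directly from its definition."""
    n = len(a)
    return np.array(
        [sum(a[k] * b[(i - k) % n] for k in range(n)) for i in range(n)]
    )


# Hypothetical clip: a static background with one small bright "agent"
# moving along row 2.
T, H, W = 8, 6, 6
video = np.full((T, H, W), 0.5)
for t in range(T):
    video[t, 2, t % W] = 1.0

change = temporal_change_map(video)

a = np.array([1.0, 2.0, 3.0, 4.0])
b = np.array([0.5, -1.0, 0.25, 2.0])
same = np.allclose(circular_conv_via_fft(a, b), circular_conv_direct(a, b))
```

In this toy clip, `change` is close to zero on the static background and large along the agent's path, so multiplying a feature map by such a mask would emphasize the moving region. How FAR actually builds and applies its mask is defined in the paper, not here.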