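Recently, the paper "On Deep Recurrent Reinforcement Learning for Active Visual Tracking of Space Noncooperative Objects", with lab Ph.D. student Dong Zhou (周栋) as first author and lab professor Guanghui Sun (孙光辉) as corresponding author, was accepted by IEEE Robotics and Automation Letters, a leading journal in robotics.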
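Active visual tracking of space noncooperative objects using only a visible-light camera is of great significance for on-orbit missions such as autonomous spacecraft rendezvous and docking and space debris removal. Because the task is a partially observable Markov decision process (POMDP), the paper combines a multi-head attention (MHA) module with a squeeze-and-excitation (SE) layer to propose a novel deep recurrent neural network architecture, RAMAVT. The model significantly improves the representational ability of the network with almost no extra computational cost. RAMAVT has been successfully applied to both value-based and policy-gradient-based deep reinforcement learning methods, and it learns to drive the spacecraft to track arbitrary space noncooperative objects with high-frequency, near-optimal velocity control commands. Extensive evaluation experiments and ablation studies on the SNCOAT benchmark demonstrate the effectiveness and robustness of the method compared with other state-of-the-art algorithms. The code will be open-sourced at https://github.com/Dongzhou-1996/RAMAVT.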
Abstract
Active tracking of space noncooperative objects that relies merely on a vision camera is of great significance for autonomous rendezvous and debris removal. Considering its partially observable Markov decision process (POMDP) property, this paper proposes a novel tracker based on deep recurrent reinforcement learning, named RAMAVT, which drives the chasing spacecraft to follow an arbitrary space noncooperative object with high-frequency, near-optimal velocity control commands. To further improve active tracking performance, we introduce a Multi-Head Attention (MHA) module and a Squeeze-and-Excitation (SE) layer into RAMAVT, which remarkably improve the representational ability of the neural network with almost no extra computational cost. Extensive experiments and an ablation study on the SNCOAT benchmark show the effectiveness and robustness of our method compared with other state-of-the-art algorithms. The source code is available at https://github.com/Dongzhou-1996/RAMAVT.
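
For readers unfamiliar with the Squeeze-and-Excitation (SE) layer mentioned above, the following is a minimal pure-Python sketch of the generic SE operation (squeeze, excitation, scale). It is not taken from the RAMAVT repository: the function names `squeeze_excite` and `random_matrix`, the nested-list feature-map layout, the bias-free weights, and the reduction ratio of 4 are all assumptions made for illustration.

```python
import math
import random


def squeeze_excite(feature_map, w1, w2):
    """Re-weight the channels of a feature map stored as [C][H][W] nested lists.

    w1 has shape [C // r][C] and w2 has shape [C][C // r] (hypothetical,
    bias-free weights; r is the reduction ratio).
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    z = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in feature_map]
    # Excitation: a bottleneck MLP (ReLU, then sigmoid) yields one gate in (0, 1) per channel.
    hidden = [max(0.0, sum(w * v for w, v in zip(row, z))) for row in w1]
    gates = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden)))) for row in w2]
    # Scale: multiply every spatial position of a channel by that channel's gate.
    scaled = [[[g * v for v in row] for row in ch] for ch, g in zip(feature_map, gates)]
    return scaled, gates


def random_matrix(rows, cols, rng):
    """Small random weights, standing in for learned parameters."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


if __name__ == "__main__":
    rng = random.Random(0)
    channels, height, width, reduction = 8, 4, 4, 4  # illustrative sizes only
    x = [[[rng.random() for _ in range(width)] for _ in range(height)]
         for _ in range(channels)]
    w1 = random_matrix(channels // reduction, channels, rng)
    w2 = random_matrix(channels, channels // reduction, rng)
    y, gates = squeeze_excite(x, w1, w2)
    print("output shape:", len(y), len(y[0]), len(y[0][0]))
    print("gates:", [round(g, 3) for g in gates])
```

In this sketch the gate adds only 2·C²/r weights for C channels and reduction ratio r, which is small next to a typical convolutional layer and is consistent with the abstract's remark about low extra computational cost. How the SE layer is actually placed and configured inside RAMAVT is not described here and should be checked against the released source code.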