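Recently, the paper “Space Non-cooperative Object Active Tracking with Deep Reinforcement Learning,” with lab PhD student Zhou Dong (周栋) as first author and lab professor Sun Guanghui (孙光辉) as corresponding author, was accepted by IEEE Transactions on Aerospace and Electronic Systems, an authoritative journal in the aerospace field.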
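Actively tracking an arbitrary space non-cooperative object using only a visual camera remains a highly challenging task. To advance this field, especially methods based on deep reinforcement learning, the paper provides an open-source benchmark for evaluating active visual tracking of space non-cooperative objects. The benchmark includes a virtual simulation environment, an evaluation toolkit, and a novel baseline algorithm based on position-based visual servoing. The paper also proposes DRLAVT, the first active visual agent based on deep reinforcement learning in the aerospace field. The agent learns a near-optimal tracking policy from visible-light images or RGBD images alone. Experimental results show that, compared with the benchmark's baseline algorithm, DRLAVT achieves excellent robustness and real-time performance through the design of a complex neural network and an efficient reward function. In addition, the multi-target training strategy adopted in the paper effectively ensures the transferability of the DRLAVT agent by forcing it to learn an optimal control policy tied to the target's motion patterns.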
Abstract
Actively tracking an arbitrary space non-cooperative object relying only on a visual sensor remains a challenging problem. In this paper, we provide an open-source benchmark for space non-cooperative object visual tracking, including a simulated environment, an evaluation toolkit, and a position-based visual servoing (PBVS) baseline algorithm, which can facilitate research on this topic, especially for methods based on deep reinforcement learning. We also present an end-to-end active visual tracker based on deep Q-learning, named DRLAVT, which learns an approximately optimal policy taking only color or RGBD images as input. To the best of our knowledge, it is the first intelligent agent used for active visual tracking in the aerospace domain. The experimental results show that DRLAVT achieves excellent robustness and real-time performance compared with the PBVS baseline, benefiting from the design of a complex neural network and an efficient reward function. In addition, the multiple-target training adopted in this paper effectively guarantees the transferability of DRLAVT by forcing the agent to learn an optimal control policy with respect to the motion patterns of the target.