Multi-Agent Spacecraft Docking with Reinforcement Learning
DOI: https://doi.org/10.31224/5877

Keywords: Multi-Agent Systems, Reinforcement Learning, autonomous docking, close-proximity spacecraft

Abstract
This research explores the application of Multi-Agent Reinforcement Learning (MARL) to a cooperative spacecraft docking problem with three chaser spacecraft and a lightly tumbling target. The study extends Proximal Policy Optimisation (PPO) based single-agent reinforcement learning (RL) docking to multi-agent spacecraft using Multi-Agent Proximal Policy Optimisation (MAPPO), targeting a simple 3-DOF planar setup. The primary goal is to develop a safe, decentralised docking policy that accounts for low-thrust constraints, fuel efficiency, and inter-agent communication limitations. To mimic the limited communication of large constellations, each agent is given access only to noisy measurements of its nearest neighbour. The policy is trained with a reward function that penalises position, velocity, and angular errors as well as fuel consumption, while encouraging successful docking and collision avoidance. Experimental validation is conducted via Monte Carlo simulations. Results demonstrate the feasibility of applying MARL to spacecraft docking tasks, achieving a 99.1% docking success rate in simulation. The research highlights the potential of reinforcement learning approaches for future distributed multi-agent space missions. However, further work is needed to address robustness concerns and to optimise the policy for more complex scenarios and larger numbers of agents.
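The reward structure described in the abstract can be sketched as a shaped reward combining error penalties with terminal bonuses. The function name, weights, and bonus magnitudes below are illustrative assumptions, not the paper's actual values:

```python
import numpy as np

def docking_reward(pos_err, vel_err, ang_err, fuel_used,
                   docked=False, collided=False,
                   w_pos=1.0, w_vel=0.5, w_ang=0.5, w_fuel=0.1):
    """Shaped per-step reward for a chaser agent (illustrative sketch).

    Penalises position, velocity, and angular errors and fuel consumption,
    and adds terminal bonuses/penalties for docking and collisions.
    All weights and magnitudes are assumed, not taken from the paper.
    """
    r = -(w_pos * np.linalg.norm(pos_err)
          + w_vel * np.linalg.norm(vel_err)
          + w_ang * abs(ang_err)
          + w_fuel * fuel_used)
    if docked:
        r += 100.0   # assumed success bonus
    if collided:
        r -= 100.0   # assumed collision penalty
    return r
```

In a MAPPO setup, such a reward would be evaluated per agent at each timestep of the 3-DOF planar simulation, with the error terms computed relative to the assigned docking port on the tumbling target.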
License
Copyright (c) 2025 Selim Olgu Pilav

This work is licensed under a Creative Commons Attribution 4.0 International License.