Preprint / Version 1

Learning coverage paths in unknown environments with deep reinforcement learning

Authors

  • Tianyao Zheng
  • Yuhui Jin
  • Haopeng Zhao
  • Zhichao Ma
  • Yongzhou Chen
  • Kunpeng Xu

DOI:

https://doi.org/10.31224/4260

Keywords:

deep reinforcement learning, coverage planning, path planning

Abstract

The Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm offers a robust solution to the coverage path planning problem, in which a robot must cover a designated area efficiently, with minimal redundancy and maximum coverage. Traditional path planning methods often lack the adaptability required for dynamic and unstructured environments. In contrast, TD3 uses twin Q-networks to reduce overestimation bias, delayed policy updates for increased stability, and target policy smoothing to keep transitions in the robot's path smooth. These features allow the robot to learn an optimal path strategy in real time, effectively balancing exploration and exploitation. This paper explores the application of TD3 to coverage path planning and demonstrates that it enables a robot to navigate complex coverage tasks adaptively and efficiently, showing significant advantages over conventional methods in coverage rate, total path length, and adaptability.
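
To illustrate the three mechanisms the abstract names, the following is a minimal sketch of a single TD3 update step in PyTorch. The network sizes, hyperparameters, and the state/action dimensions (STATE_DIM, ACTION_DIM, MAX_ACTION) are illustrative assumptions, not the configuration reported in the paper.

```python
# Minimal TD3 update sketch: twin critics, target policy smoothing,
# and delayed actor updates. Dimensions and hyperparameters below are
# assumptions for illustration, not the paper's reported settings.
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

STATE_DIM, ACTION_DIM, MAX_ACTION = 8, 2, 1.0  # assumed for this sketch

def mlp(in_dim, out_dim):
    return nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU(),
                         nn.Linear(256, 256), nn.ReLU(),
                         nn.Linear(256, out_dim))

actor = mlp(STATE_DIM, ACTION_DIM)
critic1 = mlp(STATE_DIM + ACTION_DIM, 1)   # twin Q-networks
critic2 = mlp(STATE_DIM + ACTION_DIM, 1)
actor_t, critic1_t, critic2_t = map(copy.deepcopy, (actor, critic1, critic2))

actor_opt = torch.optim.Adam(actor.parameters(), lr=3e-4)
critic_opt = torch.optim.Adam(list(critic1.parameters()) +
                              list(critic2.parameters()), lr=3e-4)

GAMMA, TAU, POLICY_NOISE, NOISE_CLIP, POLICY_DELAY = 0.99, 0.005, 0.2, 0.5, 2

def td3_update(step, state, action, reward, next_state, done):
    # Target policy smoothing: perturb the target action with clipped
    # noise so the critic cannot exploit sharp peaks in the Q-function.
    noise = (torch.randn_like(action) * POLICY_NOISE).clamp(-NOISE_CLIP, NOISE_CLIP)
    next_action = (torch.tanh(actor_t(next_state)) * MAX_ACTION
                   + noise).clamp(-MAX_ACTION, MAX_ACTION)

    # Twin critics: take the minimum of the two target Q-values to
    # counter overestimation bias.
    sa_next = torch.cat([next_state, next_action], dim=1)
    target_q = torch.min(critic1_t(sa_next), critic2_t(sa_next))
    target_q = reward + (1.0 - done) * GAMMA * target_q.detach()

    sa = torch.cat([state, action], dim=1)
    critic_loss = F.mse_loss(critic1(sa), target_q) + F.mse_loss(critic2(sa), target_q)
    critic_opt.zero_grad(); critic_loss.backward(); critic_opt.step()

    # Delayed policy updates: refresh the actor and the target networks
    # only every POLICY_DELAY critic steps, which stabilizes training.
    if step % POLICY_DELAY == 0:
        pi = torch.tanh(actor(state)) * MAX_ACTION
        actor_loss = -critic1(torch.cat([state, pi], dim=1)).mean()
        actor_opt.zero_grad(); actor_loss.backward(); actor_opt.step()
        # Soft (Polyak) update of all target networks.
        for net, tgt in ((actor, actor_t), (critic1, critic1_t), (critic2, critic2_t)):
            for p, p_t in zip(net.parameters(), tgt.parameters()):
                p_t.data.mul_(1 - TAU).add_(TAU * p.data)
```

In a coverage setting, `state` would encode something like the robot's pose and local coverage information, and `reward` would reflect newly covered area; those task-specific details are left to the paper itself.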


Posted

2024-12-26
