Preprint / Version 1

Learning coverage paths in unknown environments with deep reinforcement learning

Authors

  • Tianyao Zheng
  • Yuhui Jin
  • Haopeng Zhao
  • Zhichao Ma
  • Yongzhou Chen
  • Kunpeng Xu

DOI:

https://doi.org/10.31224/4260

Keywords:

deep reinforcement learning, coverage planning, path planning

Abstract

The Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm offers a robust solution to the coverage path planning problem, in which a robot must cover a designated area efficiently, with minimal redundancy and maximum coverage. Traditional path planning methods often lack the adaptability required for dynamic and unstructured environments. In contrast, TD3 uses twin Q-networks to reduce overestimation bias, delayed policy updates for increased stability, and target policy smoothing to keep transitions in the robot's path smooth. These features allow the robot to learn an optimal path strategy in real time, effectively balancing exploration and exploitation. This paper explores the application of TD3 to coverage path planning and demonstrates that it enables a robot to navigate complex coverage tasks adaptively and efficiently, showing significant advantages over conventional methods in coverage rate, total path length, and adaptability.
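
To illustrate the three mechanisms the abstract names, the following is a minimal sketch of a single TD3 update step in PyTorch. The network sizes, hyperparameters, and the state/action dimensions (STATE_DIM, ACTION_DIM, MAX_ACTION) are illustrative assumptions, not the configuration reported in the paper.

```python
# Minimal TD3 update sketch: twin critics, target policy smoothing,
# and delayed actor updates. Dimensions and hyperparameters below are
# assumptions for illustration, not the paper's reported settings.
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

STATE_DIM, ACTION_DIM, MAX_ACTION = 8, 2, 1.0  # assumed for this sketch

def mlp(in_dim, out_dim):
    return nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU(),
                         nn.Linear(256, 256), nn.ReLU(),
                         nn.Linear(256, out_dim))

actor = mlp(STATE_DIM, ACTION_DIM)
critic1 = mlp(STATE_DIM + ACTION_DIM, 1)   # twin Q-networks
critic2 = mlp(STATE_DIM + ACTION_DIM, 1)
actor_t, critic1_t, critic2_t = map(copy.deepcopy, (actor, critic1, critic2))

actor_opt = torch.optim.Adam(actor.parameters(), lr=3e-4)
critic_opt = torch.optim.Adam(list(critic1.parameters()) +
                              list(critic2.parameters()), lr=3e-4)

GAMMA, TAU, POLICY_NOISE, NOISE_CLIP, POLICY_DELAY = 0.99, 0.005, 0.2, 0.5, 2

def td3_update(step, state, action, reward, next_state, done):
    # Target policy smoothing: perturb the target action with clipped
    # noise so the critic cannot exploit sharp peaks in the Q-function.
    noise = (torch.randn_like(action) * POLICY_NOISE).clamp(-NOISE_CLIP, NOISE_CLIP)
    next_action = (torch.tanh(actor_t(next_state)) * MAX_ACTION
                   + noise).clamp(-MAX_ACTION, MAX_ACTION)

    # Twin critics: take the minimum of the two target Q-values to
    # counter overestimation bias.
    sa_next = torch.cat([next_state, next_action], dim=1)
    target_q = torch.min(critic1_t(sa_next), critic2_t(sa_next))
    target_q = reward + (1.0 - done) * GAMMA * target_q.detach()

    sa = torch.cat([state, action], dim=1)
    critic_loss = F.mse_loss(critic1(sa), target_q) + F.mse_loss(critic2(sa), target_q)
    critic_opt.zero_grad(); critic_loss.backward(); critic_opt.step()

    # Delayed policy updates: refresh the actor and the target networks
    # only every POLICY_DELAY critic steps, which stabilizes training.
    if step % POLICY_DELAY == 0:
        pi = torch.tanh(actor(state)) * MAX_ACTION
        actor_loss = -critic1(torch.cat([state, pi], dim=1)).mean()
        actor_opt.zero_grad(); actor_loss.backward(); actor_opt.step()
        # Soft (Polyak) update of all target networks.
        for net, tgt in ((actor, actor_t), (critic1, critic1_t), (critic2, critic2_t)):
            for p, p_t in zip(net.parameters(), tgt.parameters()):
                p_t.data.mul_(1 - TAU).add_(TAU * p.data)
```

In a coverage setting, `state` would encode something like the robot's pose and local coverage information, and `reward` would reflect newly covered area; those task-specific details are left to the paper itself.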


Posted

2024-12-26
