Abstract:
In robot trajectory planning, deep reinforcement learning (DRL) based methods often suffer from low learning efficiency and convergence to locally optimal solutions. To address these defects, a curiosity network and a modified optimization framework, actor-critic-curiosity (A-C-C), are proposed. A-C-C enables the agent to consider problems in a more human-like way, paying more attention to the process of exploration than to the result. By promoting the exploration of unknown regions, A-C-C effectively improves the learning efficiency of the DRL method and avoids locally optimal solutions. The experimental results show that the proposed method can be combined with different reward functions to accelerate exploration efficiency by 43.6%-101.2%. The mean convergence is also improved by 4.8%-6.4%.
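The abstract does not specify the internals of the curiosity network. A minimal sketch of one common realization, a forward-dynamics prediction-error bonus added to the extrinsic reward before the critic update, is given below; the class name `CuriosityModule`, the network sizes, and the weight `beta` are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CuriosityModule(nn.Module):
    """Forward-dynamics curiosity (assumed design): predict the next
    state from the current state and action; the prediction error is
    used as an intrinsic reward, which is large in poorly explored
    regions of the state space."""

    def __init__(self, state_dim: int, action_dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, state_dim),
        )
        self.opt = torch.optim.Adam(self.parameters(), lr=1e-3)

    def intrinsic_reward(self, state, action, next_state):
        # Prediction error of the forward model = curiosity bonus.
        pred = self.net(torch.cat([state, action], dim=-1))
        loss = F.mse_loss(pred, next_state)
        # Train the forward model on the observed transition so the
        # bonus decays as a region becomes familiar.
        self.opt.zero_grad()
        loss.backward()
        self.opt.step()
        return loss.item()

# Usage sketch: blend extrinsic and intrinsic rewards; `beta` is a
# hypothetical exploration weight.
if __name__ == "__main__":
    cur = CuriosityModule(state_dim=6, action_dim=2)
    s, a, s_next = torch.randn(6), torch.randn(2), torch.randn(6)
    beta, r_ext = 0.1, 1.0
    r_total = r_ext + beta * cur.intrinsic_reward(s, a, s_next)
    print(f"shaped reward: {r_total:.3f}")
```

The shaped reward `r_total` would then feed the actor-critic update in place of the raw task reward, which is one way a curiosity term can be combined with different reward functions as the abstract describes.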