Li Jiaxiang, Chen Hao, Huang Jian, Zhang Zhongjie. HEURISTIC ACCELERATED DEEP Q NETWORK BASED ON COGNITIVE ACTION MODEL[J]. Computer Applications and Software, 2024, 41(9): 148-155. DOI: 10.3969/j.issn.1000-386x.2024.09.022

HEURISTIC ACCELERATED DEEP Q NETWORK BASED ON COGNITIVE ACTION MODEL

  • Due to the expansion of the state-action space and the sparse rewards of complex environments, it is difficult for a reinforcement learning agent to learn an optimal policy from scratch. Therefore, a heuristic accelerated deep Q network based on a cognitive behavior model is proposed. It incorporates symbolic rules into the learning network and guides policy learning dynamically, so that the agent's learning is effectively accelerated. The algorithm models heuristic knowledge as a BDI-based cognitive behavior model, which generates cognitive behavior knowledge to guide the agent's policy learning, and a heuristic policy network is designed to guide the agent's action selection online. Experiments in typical Gym environments and the StarCraft II environment show that the algorithm can dynamically extract effective cognitive behavior knowledge according to environmental changes and, with the help of the heuristic policy network, accelerate the convergence of the agent's policy. A minimal sketch of the general idea is given after the abstract.
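The sketch below illustrates only the generic heuristically-accelerated action-selection scheme that the abstract alludes to: the DQN's Q-values are augmented by a rule-derived heuristic bonus before the greedy choice is made. The weighting parameter `xi`, the function name, and the zero-default bonuses are hypothetical illustration choices, not details taken from the paper.

```python
import numpy as np

def heuristic_action(q_values, heuristic_values, xi=1.0, epsilon=0.1, rng=None):
    """Epsilon-greedy action selection with a heuristic bonus.

    q_values:         Q(s, a) estimates from the DQN for the current state.
    heuristic_values: H(s, a) bonuses produced by a rule-based (e.g. BDI-derived)
                      heuristic policy; zero where no rule fires (assumption).
    xi:               weight of the heuristic guidance (hypothetical parameter).
    """
    rng = rng or np.random.default_rng()
    if rng.random() < epsilon:
        return int(rng.integers(len(q_values)))              # explore uniformly
    return int(np.argmax(q_values + xi * heuristic_values))  # exploit with guidance

# Usage example: a fired symbolic rule nudges the agent toward action 2.
q = np.array([0.5, 0.7, 0.6])
h = np.array([0.0, 0.0, 0.3])
print(heuristic_action(q, h, xi=1.0, epsilon=0.0))  # -> 2
```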