Research Article

Exploration Entropy for Reinforcement Learning

Figure 4

Steps and EE learning performance for (1) Q-learning with the Softmax strategy and (2) PQL in Maze B. (a) Softmax (Maze B). (b) PQL (Maze B).