Research Article

Exploration Entropy for Reinforcement Learning

Figure 3

Steps and EE learning performance for (1) Q-learning with the Softmax strategy and (2) PQL in Maze A. (a) Softmax (Maze A). (b) PQL (Maze A).