Exploration Entropy for Reinforcement Learning

<div>Learning performance (steps and EE) for (1) Q-learning with the Softmax strategy and (2) PQL in Maze A. (a) Softmax (Maze A). (b) PQL (Maze A).</div>

Mathematical Problems in Engineering

Figure 3
