Exploration Entropy for Reinforcement Learning

<div>Learning performance, measured in steps and exploration entropy (EE), in Maze B: (a) Q-learning with the Softmax strategy; (b) PQL.</div>

Mathematical Problems in Engineering

Figure 4: Exploration Entropy for Reinforcement Learning