Research Article
A Novel Reinforcement Learning Architecture for Continuous State and Action Spaces
Table 1
Comparison of the best policies for the dribbling problem.
| ā | SARSA | ()-learning |
| --- | --- | --- |
| Algorithm type | Actor-Critic | ()-learning |
| Function approx. | RBFs | CMACs |
| States | Continuous | Continuous |
| Actions | Continuous | Discrete |
| Total learning time | 10 minutes | 24 hours 30 minutes |
| Average distance | 25.45 meters | 29.21 meters |
| Maximum distance | 36.23 meters | 39.0 meters |