Research Article

Reinforcement Learning for Routing in Cognitive Radio Ad Hoc Networks

Table 5

Simulation parameters and values for investigating the exploration approaches.

| Category | Parameter | Value |
|---|---|---|
| SU | Traditional ε-greedy exploration probability | {0.07, 0.14} |
| SU | Traditional softmax exploration temperature | {0.04, 0.05} |
| SU | Initial dynamic softmax temperature | 0.05 |
| SU | Dynamic softmax adjustment factor | 0.01 |
| SU | Dynamic softmax temperature range | [0.01, 0.1] |
| SU | Dynamic softmax Q-value threshold | 0.1 |
| PU | Standard deviation of PUL | {0.2, 0.8} |
| Channel | Mean PER | 0 |
| Channel | Standard deviation of PER | 0 |
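
To make the roles of the exploration probability and the exploration temperature in Table 5 concrete, the following is a minimal Python sketch of textbook ε-greedy and softmax (Boltzmann) action selection. It is not the article's implementation: the function names, the example Q-values, and the convention that a higher Q-value is preferred are assumptions made for illustration, and the dynamic softmax variant is not sketched because its update rule is not given in this table.

```python
import math
import random

# Values taken from Table 5; the constant names are illustrative only.
EPSILON_VALUES = (0.07, 0.14)        # traditional epsilon-greedy exploration probability
TEMPERATURE_VALUES = (0.04, 0.05)    # traditional softmax exploration temperature


def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the greedy one.

    Assumption: a higher Q-value denotes a better action.
    """
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def softmax_probabilities(q_values, temperature):
    """Return the Boltzmann selection probability of each action.

    Subtracting the maximum Q-value keeps exp() numerically stable
    at low temperatures without changing the resulting distribution.
    """
    q_max = max(q_values)
    weights = [math.exp((q - q_max) / temperature) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]


def softmax_select(q_values, temperature, rng=random):
    """Sample one action index from the Boltzmann distribution."""
    probs = softmax_probabilities(q_values, temperature)
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]


if __name__ == "__main__":
    q = [0.1, 0.5, 0.3]  # hypothetical Q-values for three candidate next hops
    for eps in EPSILON_VALUES:
        print("epsilon", eps, "->", epsilon_greedy(q, eps))
    for temp in TEMPERATURE_VALUES:
        print("temperature", temp, "->", softmax_probabilities(q, temp))
```

Under these assumptions, a lower temperature concentrates the selection probability on the action with the highest Q-value (more exploitation), while a higher temperature or a larger ε spreads selections across the alternatives (more exploration).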