Research Article

An Empirical Investigation of Transfer Effects for Reinforcement Learning

Algorithm 2: The algorithm for training the non-transfer and transfer RL methods.
input: S_training, TRQ_{n−1}[S_{n−1}, A_{n−1}]
(1) initialize
(2)  new NRQ_n[S_n, A_n]
(3)  new TRQ_n[S_n, A_n]
(4)  TRQ_n[S_n, A_n] ⟵ TRQ_{n−1}[S_{n−1}, A_{n−1}]
(5)  upper_bound = n + 1
(6)  Assign S_training to s_nt and s_tr
(7)  finish = FALSE
(8)  NonTrans_Tr_Steps = 0
(9)  Trans_Tr_Steps = 0
(10) repeat
(11)  NRQ_n[S_n, A_n], Steps_nt = RL_Sort(s_nt, NRQ_n[S_n, A_n])
(12)  TRQ_n[S_n, A_n], Steps_tr = RL_Sort(s_tr, TRQ_n[S_n, A_n])
(13)  NonTrans_Tr_Steps = NonTrans_Tr_Steps + Steps_nt
(14)  Trans_Tr_Steps = Trans_Tr_Steps + Steps_tr
(15)  Sort the n! lists in S_n by NRQ_n, compute the average Avg_nt, and pick the list with the max value as s_nt
(16)  Sort the n! lists in S_n by TRQ_n, compute the average Avg_tr, and pick the list with the max value as s_tr
(17)  if (|Avg_nt − Avg_tr|/Avg_tr <= 0.1) or (Avg_nt <= upper_bound and Avg_tr <= upper_bound)
(18)   finish = TRUE
(19) until finish is TRUE
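
The control flow of Algorithm 2 can be sketched in Python as follows. This is an illustrative sketch under several assumptions, not the authors' implementation: Q-tables are modeled as plain dictionaries, the transfer in step (4) is modeled as a copy of the previous table's entries, and the two callables `rl_sort` (standing in for RL_Sort) and `evaluate` (standing in for steps (15) and (16), returning the average and the next training list) are hypothetical placeholders that the caller must supply. The `max_rounds` cap is an added safeguard that does not appear in the algorithm.

```python
from typing import Any, Callable, Dict, Sequence, Tuple

QTable = Dict[Any, float]  # assumption: a Q-table is a dict keyed by (state, action)


def should_stop(avg_nt: float, avg_tr: float, upper_bound: float,
                tol: float = 0.1) -> bool:
    """Stopping test of step (17); assumes avg_tr > 0."""
    close = abs(avg_nt - avg_tr) / avg_tr <= tol
    both_small = avg_nt <= upper_bound and avg_tr <= upper_bound
    return close or both_small


def train_pair(
    n: int,
    s_training: Sequence[int],
    trq_prev: QTable,
    rl_sort: Callable[[Sequence[int], QTable], Tuple[QTable, int]],
    evaluate: Callable[[QTable, int], Tuple[float, Sequence[int]]],
    max_rounds: int = 1000,  # safeguard, not part of Algorithm 2
) -> Tuple[QTable, QTable, int, int]:
    """Train the non-transfer and transfer learners side by side."""
    nrq: QTable = {}                 # step (2): fresh non-transfer table
    trq: QTable = dict(trq_prev)     # steps (3)-(4): seeded from the size n-1 table
    upper_bound = n + 1              # step (5)
    s_nt = s_tr = s_training         # step (6)
    nt_steps = tr_steps = 0          # steps (8)-(9)

    for _ in range(max_rounds):      # steps (10)-(19): repeat ... until
        nrq, steps_nt = rl_sort(s_nt, nrq)      # step (11)
        trq, steps_tr = rl_sort(s_tr, trq)      # step (12)
        nt_steps += steps_nt                    # step (13)
        tr_steps += steps_tr                    # step (14)
        avg_nt, s_nt = evaluate(nrq, n)         # step (15)
        avg_tr, s_tr = evaluate(trq, n)         # step (16)
        if should_stop(avg_nt, avg_tr, upper_bound):  # step (17)
            break
    return nrq, trq, nt_steps, tr_steps
```

The two accumulated step counts returned at the end correspond to NonTrans_Tr_Steps and Trans_Tr_Steps, which is what allows the training cost of the two methods to be compared.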