Research Article
An Empirical Investigation of Transfer Effects for Reinforcement Learning
Algorithm 2
The algorithm for training the non-transfer and transfer RL methods.
Input: S_training, TRQ_{n-1}[S_{n-1}, A_{n-1}]
(1)  initialize
(2)  new NRQ_n[S_n, A_n]
(3)  new TRQ_n[S_n, A_n]
(4)  TRQ_n[S_n, A_n] ⟵ TRQ_{n-1}[S_{n-1}, A_{n-1}]
(5)  upper_bound = n + 1
(6)  assign S_training to s_nt and s_tr
(7)  finish = FALSE
(8)  NonTrans_Tr_Steps = 0
(9)  Trans_Tr_Steps = 0
(10) repeat
(11)   NRQ_n[S_n, A_n], Steps_nt = RL_Sort(s_nt, NRQ_n[S_n, A_n])
(12)   TRQ_n[S_n, A_n], Steps_tr = RL_Sort(s_tr, TRQ_n[S_n, A_n])
(13)   NonTrans_Tr_Steps = NonTrans_Tr_Steps + Steps_nt
(14)   Trans_Tr_Steps = Trans_Tr_Steps + Steps_tr
(15)   sort the n! lists in S_n by NRQ_n, compute the average Avg_nt, and pick the list with the maximum value as s_nt
(16)   sort the n! lists in S_n by TRQ_n, compute the average Avg_tr, and pick the list with the maximum value as s_tr
(17)   if (|Avg_nt − Avg_tr| / Avg_tr <= 0.1) or (Avg_nt <= upper_bound and Avg_tr <= upper_bound)
(18)     finish = TRUE
(19) until finish is TRUE
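The excerpt does not define the internals of RL_Sort or the reward function, so the following is only a minimal tabular Q-learning sketch of the core idea, under assumptions not stated in the source: states are permutations of the list, each action swaps one adjacent pair, the reward is +10 on reaching the sorted goal and −1 otherwise, and the transfer step of line (4) is shown simply as seeding the new Q-table with a prior one. The names `rl_sort` and `train` are illustrative, not the authors' code.

```python
import random
from itertools import permutations

def rl_sort(state, Q, n_actions, alpha=0.5, gamma=0.9, eps=0.1, max_steps=200):
    """One Q-learning episode that sorts `state` via adjacent swaps.

    Returns the updated Q-table and the number of steps taken,
    mirroring the (Q, Steps) pair returned on lines (11)-(12)."""
    actions = range(n_actions)
    s, goal, steps = tuple(state), tuple(sorted(state)), 0
    while s != goal and steps < max_steps:
        # epsilon-greedy action selection over the tabular Q-values
        if random.random() < eps:
            a = random.randrange(n_actions)
        else:
            a = max(actions, key=lambda b: Q.get((s, b), 0.0))
        nxt = list(s)
        nxt[a], nxt[a + 1] = nxt[a + 1], nxt[a]  # action a swaps positions a, a+1
        nxt = tuple(nxt)
        r = 10.0 if nxt == goal else -1.0  # assumed reward shape
        best_next = max(Q.get((nxt, b), 0.0) for b in actions)
        # standard one-step Q-learning update
        Q[(s, a)] = Q.get((s, a), 0.0) + alpha * (r + gamma * best_next - Q.get((s, a), 0.0))
        s, steps = nxt, steps + 1
    return Q, steps

def train(n, prior_Q=None, episodes=500):
    """Train a sorter for lists of length n over repeated episodes.

    If `prior_Q` is given, it seeds the new table, loosely playing the
    role of the transfer initialization on line (4)."""
    Q = dict(prior_Q) if prior_Q else {}
    total_steps = 0
    states = list(permutations(range(n)))  # the n! lists in S_n
    for _ in range(episodes):
        Q, steps = rl_sort(random.choice(states), Q, n - 1)
        total_steps += steps
    return Q, total_steps
```

Running `train` twice, once with an empty table and once seeded with a previously learned one, gives the non-transfer and transfer step counts (NonTrans_Tr_Steps vs. Trans_Tr_Steps) that the algorithm compares.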