Research Article

Modeling the Peak-Period Bus Commuting Behavior with Staggered Work Hours Using a Regret-Minimizing Learning Method

Algorithm 1

Algorithm overview (for one agent only).
(1) Initialize Q-table;
(2) Initialize history of estimates;
(3) Initialize learning and exploration rates;
(4) For each iteration do
 (5) Receive app recommendations;
 (6) Update learning and exploration rates;
(7) Choose action using policy derived from Q-table;
(8) Take action and observe the commuting cost using equation (1);
(9) Update estimate using equation (5);
(10) Update regret of action using equation (6);
 (11) Update Q value of the chosen action using equation (8);
(12) End
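
The step order of Algorithm 1 can be illustrated with a minimal Python sketch for a single agent. Equations (1), (5), (6), and (8) are not reproduced in this section, so every formula in the sketch is an assumed stand-in rather than the authors' definition: the cost comes from a caller-supplied function, the estimate of an action is the mean of its recorded costs, the regret of an action is its mean estimate minus the lowest mean estimate among the actions seen so far, and the Q value is an exponentially weighted average of the negative regret. The epsilon-greedy policy, the multiplicative decay of both rates, and the names run_agent, cost_fn, and recommend_fn are likewise hypothetical.

```python
import random


def run_agent(actions, cost_fn, recommend_fn, episodes=200,
              alpha0=1.0, eps0=1.0, decay=0.99, seed=0):
    """Single-agent loop mirroring the step order of Algorithm 1.

    All update rules below are illustrative assumptions; they stand in
    for equations (1), (5), (6), and (8), which are not shown here.
    """
    rng = random.Random(seed)
    q = {a: 0.0 for a in actions}          # step 1: Q-table
    history = {a: [] for a in actions}     # step 2: history of estimates
    regret = {a: 0.0 for a in actions}
    alpha, eps = alpha0, eps0              # step 3: learning / exploration rates

    for t in range(1, episodes + 1):       # step 4
        # Step 5: recommendations, assumed to be {action: estimated cost}.
        for a, est in recommend_fn(t).items():
            history[a].append(est)

        # Step 6: assumed multiplicative decay of both rates.
        alpha *= decay
        eps *= decay

        # Step 7: assumed epsilon-greedy policy over the Q-table.
        if rng.random() < eps:
            action = rng.choice(actions)
        else:
            action = max(actions, key=lambda a: q[a])

        # Step 8: observe the commuting cost (stand-in for equation (1)).
        cost = cost_fn(action, t)

        # Step 9: record the cost as a new estimate (stand-in for equation (5)).
        history[action].append(cost)

        # Step 10: regret of the chosen action (stand-in for equation (6)).
        means = {a: sum(h) / len(h) for a, h in history.items() if h}
        regret[action] = means[action] - min(means.values())

        # Step 11: Q update with reward = -regret (stand-in for equation (8)).
        q[action] = (1 - alpha) * q[action] + alpha * (-regret[action])

    return q, regret


if __name__ == "__main__":
    # Hypothetical departure options with fixed, made-up costs.
    costs = {"early": 5.0, "on_time": 3.0, "late": 8.0}
    q, regret = run_agent(list(costs), lambda a, t: costs[a], lambda t: {})
    print(q)
    print(regret)
```

Under these assumptions the regret of an action is never negative, so Q values never rise above zero, and the action with the lowest mean estimated cost keeps a Q value of zero. The greedy branch of the policy therefore favors low-regret actions as exploration decays.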