Journal of Advanced Transportation

Research Article

Modeling the Peak-Period Bus Commuting Behavior with Staggered Work Hours Using a Regret-Minimizing Learning Method

Algorithm overview (for one agent only).

(1)	Initialize the Q-table;
(2)	Initialize the history of estimates;
(3)	Initialize the learning rate and the exploration rate;
(4)	For each iteration do
(5)	Receive app recommendations;
(6)	Update the learning rate and the exploration rate;
(7)	Choose action using policy derived from Q-table;
(8)	Take action and observe the commuting cost using equation (1);
(9)	Update estimate using equation (5);
(10)	Update regret of action using equation (6);
(11)	Update the Q-value of the chosen action using equation (8);
(12)	End
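
The control flow of the listing can be sketched in Python for a single agent. This is only an illustrative sketch: equations (1), (5), (6), and (8) are not reproduced in this listing, so every formula below is a labeled stand-in rather than the article's own. The names `commuting_cost` and `run_agent`, the four departure slots with their base costs, the decay schedules, the epsilon-greedy policy, and the way the app recommendation is used (as the action tried when exploring) are all assumptions made for illustration.

```python
import random


def commuting_cost(action, rng):
    """Hypothetical stand-in for equation (1): base cost per slot plus noise."""
    base = [30.0, 22.0, 26.0, 35.0]  # assumed costs, not taken from the article
    return base[action] + rng.uniform(-1.0, 1.0)


def run_agent(n_actions=4, n_iterations=2000, seed=0):
    rng = random.Random(seed)
    q = [0.0] * n_actions               # step (1): Q-table, one entry per action
    estimates = [0.0] * n_actions       # step (2): cost estimate per action
    visited = [False] * n_actions       # which actions have an estimate so far
    alpha0, epsilon0 = 0.5, 1.0         # step (3): assumed initial rates

    for t in range(1, n_iterations + 1):            # step (4)
        # Step (5): placeholder recommendation (a random action, assumed).
        recommended = rng.randrange(n_actions)

        # Step (6): assumed 1/sqrt(t) decay for both rates.
        alpha = alpha0 / t ** 0.5
        epsilon = epsilon0 / t ** 0.5

        # Step (7): epsilon-greedy policy derived from the Q-table (assumed).
        if rng.random() < epsilon:
            action = recommended
        else:
            action = max(range(n_actions), key=q.__getitem__)

        # Step (8): take the action and observe its commuting cost.
        cost = commuting_cost(action, rng)

        # Step (9): exponential smoothing as a stand-in for equation (5).
        if visited[action]:
            estimates[action] += alpha * (cost - estimates[action])
        else:
            estimates[action] = cost
            visited[action] = True

        # Step (10): stand-in for equation (6); regret is the gap between the
        # chosen action's estimate and the best estimate seen so far.
        best_known = min(e for e, seen in zip(estimates, visited) if seen)
        regret = estimates[action] - best_known

        # Step (11): stand-in for equation (8); move Q toward minus the regret.
        q[action] += alpha * (-regret - q[action])

    return q, estimates


if __name__ == "__main__":
    q_values, cost_estimates = run_agent()
    print("Q-values:", q_values)
    print("Cost estimates:", cost_estimates)
```

Under these assumptions, an action with zero regret keeps a Q-value of zero while every other visited action is pushed below zero, so the greedy choice settles on the slot with the lowest estimated cost. Swapping in the article's equations would only require replacing the four commented stand-ins.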