Shock and Vibration

Research Article

Multiscale Time-Frequency Sparse Transformer Based on Partly Interpretable Method for Bearing Fault Diagnosis

Training of MTFST.

	Input: Three multiscale TFRs and their labels, which denote the fault types.
(1)	Set the training batch size, the maximum training epoch max_epoch, the token embedding dimension, the self-attention weight matrix size, the number of heads, the position-wise feed-forward network weight matrix size, the number of encoder blocks, the number of decoder blocks, and the number of fault types.
(2)	Initialize trainable parameters of MTFST
(3)	for epoch in 1, 2, …, max_epoch do
(4)	for step in 1, 2, …, max_step do
(5)	//Tokenizer
(6)	for each TFR in each of the three multiscale batches do
(7)	Reshape the TFR at each scale, then slice it into a patch sequence;
(8)	Add position encoding to each patch sequence;
(9)	end; stack the batches to obtain three token sequences, one per scale.
(10)	//Encoders
(11)	for in 1, 2, …, do
(12)	,
(13)	;
(14)	,
(15)	;
(16)	,
(17)	.
(18)	end
(19)	//Decoder
(20)	for block in 0, 1, 2, …, do
(21)	if (block == 0)
(22)	,
(23)	;
(24)	else
(25)	,
(26)	;
(27)	end
(28)	//Classifier
(29)	Obtain feature matrix ;
(30)	;
(31)	;
(32)	Batch loss ;
(33)	Calculate gradients , ;
(34)	Update parameters , ;
(35)	end
(36)	end
	Output: Weights and biases
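
The tokenizer steps (6) to (9) can be illustrated with a minimal sketch for a single TFR. It is not the authors' implementation: the function names, the patch size, and the use of fixed sinusoidal position encoding are assumptions made for illustration, and the learned projection to the token embedding dimension is omitted, so each token here is simply a flattened patch.

```python
import math


def slice_into_patches(tfr, patch_h, patch_w):
    """Split a 2-D TFR (list of rows) into flattened patches in row-major order.

    patch_h and patch_w are hypothetical parameters, not values from the paper.
    """
    rows, cols = len(tfr), len(tfr[0])
    if rows % patch_h or cols % patch_w:
        raise ValueError("TFR size must be divisible by the patch size")
    patches = []
    for r in range(0, rows, patch_h):
        for c in range(0, cols, patch_w):
            patches.append(
                [tfr[r + i][c + j] for i in range(patch_h) for j in range(patch_w)]
            )
    return patches


def sinusoidal_position_encoding(num_tokens, dim):
    """Fixed sinusoidal encoding (an assumption; the paper may use another scheme)."""
    pe = [[0.0] * dim for _ in range(num_tokens)]
    for pos in range(num_tokens):
        for k in range(dim):
            angle = pos / (10000 ** (2 * (k // 2) / dim))
            pe[pos][k] = math.sin(angle) if k % 2 == 0 else math.cos(angle)
    return pe


def tokenize(tfr, patch_h, patch_w):
    """Slice one TFR into patches and add position encoding to each patch."""
    patches = slice_into_patches(tfr, patch_h, patch_w)
    pe = sinusoidal_position_encoding(len(patches), patch_h * patch_w)
    return [[x + p for x, p in zip(patch, row)] for patch, row in zip(patches, pe)]


# Toy 4x4 "TFR" split into four 2x2 patches, giving four tokens of length 4.
toy_tfr = [[r * 4 + c for c in range(4)] for r in range(4)]
tokens = tokenize(toy_tfr, 2, 2)
```

In the algorithm, the same procedure would be applied to each of the three scales, and the resulting sequences would be stacked over the batch before entering the encoders.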