Research Article

Multiscale Time-Frequency Sparse Transformer Based on Partly Interpretable Method for Bearing Fault Diagnosis

Figure 8

Attention weights for each fault label. From top to bottom, the odd rows show the weight maps of the first attention layer in encoder1, encoder2, encoder3, and the decoder, respectively, and the even rows show those of the last attention layer. The most active patches are marked with a dashed white box.