Abstract

To address the degradation of diagnostic performance caused by data distribution differences and the scarcity of labeled fault data, this study focuses on transfer learning-based cross-domain fault diagnosis, which has attracted considerable attention. However, deep transfer learning-based methods are often time-consuming and costly, particularly when tuning hyperparameters. To address this issue, building on classical feature-based transfer learning, this study introduces a new framework for bearing fault diagnosis based on supervised joint distribution adaptation and feature refinement. It first applies ensemble empirical mode decomposition to the raw signals and extracts statistical features. Then, a new feature refinement module is designed to refine domain adaptation features from the high-dimensional feature set by evaluating the fault distinguishability and working-condition invariance of the features. Next, a supervised joint distribution adaptation method is proposed to conduct improved joint distribution alignment that preserves neighborhood relationships within a manifold subspace. Finally, an adaptive classifier is trained to predict fault labels of feature data across varying working conditions. To demonstrate the cross-domain fault diagnosis performance and superiority of the proposed methods, two bearing datasets are used in experiments, and the results verify that the model built by the proposed framework achieves desirable diagnosis performance under different working conditions and clearly outperforms the comparative models.

1. Introduction

In recent years, with the rapid and sustained advancement of modern industrial equipment, rotating machinery has come to play a major role in various production scenarios, such as transportation, mining, logistics, electricity, and manufacturing [1]. The bearing is one of the most important components of industrial machinery, and a bearing malfunction may cause serious accidents and economic losses. Moreover, bearings typically operate under complicated conditions, which makes them prone to failure. More importantly and challengingly, it is usually difficult to collect fault samples from real-world mechanical facilities under variable operating conditions [2]. Therefore, when facing real-world industrial scenes, most existing artificial intelligence-based fault diagnosis techniques for rolling bearings still suffer from challenges such as data distribution differences and inadequate fault samples [3, 4].

Artificial intelligence technologies applied to bearing fault diagnosis are mainly divided into three classes: classical machine learning-based methods (CMLM), deep learning-based methods (DLM), and transfer learning-based methods (TLM) [5, 6]. CMLM, which have been widely studied for many years, include the support vector machine (SVM) [7], artificial neural network (ANN) [8], k-nearest neighbor (KNN) [9], extreme learning machine (ELM) [10], and random forest (RF) [11]. These methods have major drawbacks, including heavy reliance on expert knowledge under variable working conditions and the default assumption that all samples share the same probability distribution [3, 6]. At present, DLM have attracted widespread attention and research owing to their powerful ability to automatically extract deep features with strong representation performance. Commonly studied approaches include the deep auto-encoder (DAE) [12], deep residual network [13], deep belief network (DBN) [14], and convolutional neural network (CNN) [15]. Nevertheless, several shortcomings of DLM remain prominent [1, 3]. In particular, fault diagnosis of rotating machinery based on traditional DLM adheres to the hypothesis that data under diverse working conditions follow an identical distribution, which conflicts with the data distribution deviation observed in actual operation. Furthermore, a DLM-based bearing fault diagnosis model requires sufficient training samples to achieve ideal performance, which contradicts the insufficiency of fault data in actual industrial scenes. Finally, DLM usually involve a costly and time-consuming procedure to tune numerous hyperparameters [3].

To date, TLM have attracted increasing attention and research in cross-domain fault diagnosis (CFD) because their distribution adaptation ability is expected to tackle the above challenges of CMLM and DLM. TLM aim to learn related domain knowledge from a source domain (SD) and apply it to a target domain (TD). In the bearing fault diagnosis field, a fault dataset under one working condition can constitute a domain. Transfer learning methods can be divided into two classes: classical manual feature extraction-based transfer learning (TL) approaches and deep transfer learning (DTL) approaches [3, 16]. Although DTL methods have attracted increasing attention in bearing fault diagnosis under different working conditions, they still have drawbacks. A common and important one is that building a desirable DTL-based fault diagnosis model requires a costly and time-consuming procedure to adjust numerous hyperparameters. Accordingly, this article focuses on the typical feature-based TL approach to achieve desirable CFD of rolling bearings in real-world industrial scenarios. Commonly studied feature-based TL methods include balanced distribution adaptation (BDA) [17], joint distribution adaptation (JDA) [18], transfer component analysis (TCA) [19], the geodesic flow kernel (GFK) [20], and joint geometrical and statistical alignment (JGSA) [21]. Based on these methods, several intelligent models for cross-domain diagnosis have been investigated. In [22], a transfer deep learning network was proposed to resolve the drawbacks of existing deep learning-based rolling bearing fault algorithms; in this network, feature transfer using TCA and a pretrained convolutional neural network is performed. In [23], a source domain multisample JDA (SM-JDA) approach was used for bearing fault diagnosis under variable operating conditions. In [24], BDA was introduced to facilitate domain adaptation in bearing cross-domain fault diagnosis. In [25], aiming at the domain shift (distribution discrepancy) issue, multikernel joint distribution adaptation (MKJDA) with dynamic distribution alignment was proposed for bearing fault diagnosis. In [3], based on BDA, a new balanced adaptation regularization was designed to solve the degradation of CFD performance caused by sample distribution discrepancy. In [26], an adaptive manifold probability distribution method was studied for CFD; in this method, GFK was implemented for distribution adaptation, and a domain adaptive classifier was trained to diagnose the target domain under different working conditions. In [27], transfer sparse coding and JGSA were combined to construct a novel fault diagnosis approach for bearings under different operating conditions. Although the above-mentioned methods have successfully realized CFD of bearings, three issues still block their application in actual industrial scenarios. (1) In most studies, distribution adaptation is implemented by aligning probability distributions in the primitive feature space, which makes it difficult to tackle feature distortion and may lead to poor domain adaptation (DA) performance [28].
(2) Most distribution adaptation objectives in TLM merely concentrate on decreasing probability distribution differences and enhancing the transferability of features, while the class distinguishability of features is usually neglected, which may lead to poor classification performance [29, 30]. (3) In the process of distribution adaptation, the impact of class information and neighborhood relationships of the feature data has not been effectively considered, which may restrict the CFD accuracy and generalization ability of the model [28, 31].

Considering the three issues of the above-mentioned TLM approaches, we investigate a new DA idea, namely, joint distribution alignment with neighborhood relationship preservation in a manifold subspace. Moreover, to improve the DA capability, we consider the impact of the fault discriminability and working-condition invariance (WCI) of features during DA. Therefore, we design a feature refinement module that refines features with better domain adaptability from the original high-dimensional feature set (OHFS). In view of the above discussion, this study proposes a new CFD framework for bearings based on feature refinement and supervised JDA. The framework contains four modules: a signal processing and feature extraction module, a feature refinement module, a DA module, and a classifier module for CFD. The signal processing and feature extraction module uses ensemble empirical mode decomposition (EEMD) to decompose the raw signals collected from the bearing and conducts feature extraction. In the feature refinement module, a domain adaptation feature refinement method based on classification accuracy and distribution discrepancies (DFCD) is investigated to estimate the fault distinguishability and WCI of each feature. In the DA module, a new DA method, termed improved JDA with manifold subspace learning and neighborhood relationship preserving (IDAMN), is proposed. Finally, in the cross-domain classifier module, a classical machine learning classifier, the KNN, is trained on labeled SD data, and the trained classifier predicts the labels of the TD data. The main contributions are summarized as follows:
(1) A feature refinement module, named domain adaptation feature refinement based on classification accuracy and distribution discrepancies (DFCD), is designed. The classical KNN classifier is utilized to estimate the fault distinguishability of features, and the maximum mean discrepancy (MMD) and Kullback–Leibler divergence (KLD) are employed to quantify the WCI of features. On this basis, a new feature evaluation index is constructed to refine DA features with better fault distinguishability and WCI from the primitive feature set.
(2) A new DA method, improved JDA with manifold subspace learning and neighborhood relationship preserving (IDAMN), is proposed. IDAMN performs improved JDA between domains in a learned manifold subspace while considering neighborhood relationship preservation and category information, which helps shrink distribution differences while overcoming feature distortion and enhancing the discriminability of features.
(3) Aiming at the key challenges that still exist in applying artificial intelligence-based fault diagnosis approaches to actual application scenes, a new fault diagnosis framework constructed from DFCD and IDAMN, termed DFCD-IDAMN, is designed. This framework can prominently strengthen CFD performance. Two bearing datasets are utilized to set up a series of CFD tasks for experimental verification. The results show that DFCD-IDAMN significantly outperforms comparative models built from common baseline methods.

The rest of this article is organized as follows. Section 2 introduces the preliminaries of ensemble empirical mode decomposition, domain adaptation, MMD, and local Fisher discriminant analysis. Section 3 describes the DFCD-IDAMN framework. Section 4 presents the experimental validation that illustrates the performance of the proposed methods. The conclusions of this work are given in Section 5.

2. Preliminaries

2.1. Ensemble Empirical Mode Decomposition (EEMD)

EEMD was proposed to overcome the mode mixing problem of empirical mode decomposition (EMD). Its basic principle is that Gaussian white noise is added to the raw signal so that the signal components are automatically distributed to appropriate reference scales. Therefore, EEMD can achieve better time-frequency analysis of nonstationary vibration signals from bearings [32, 33]. The procedure of EEMD is illustrated in Figure 1, and the specific implementation process is as follows [34]:
(1) Given an original signal $x(t)$, set the trial index $i$ to 1 and set the number of ensemble averages of EEMD to $N$.
(2) Add Gaussian white noise (GWN) $n_i(t)$ to $x(t)$ to obtain the noisy signal $x_i(t)$:
$$x_i(t) = x(t) + n_i(t).$$
(3) Apply EMD to $x_i(t)$ to obtain the intrinsic mode functions (IMFs) and the corresponding residual component:
$$x_i(t) = \sum_{j=1}^{J} c_{i,j}(t) + r_i(t),$$
where $c_{i,j}(t)$ represents the $j$-th IMF component obtained by EMD, $J$ is the number of IMFs, and $r_i(t)$ represents the residual component.
(4) Add a different GWN realization to $x(t)$ and repeat steps (2) and (3) $N$ times; the ensemble average of the IMF components obtained in the $N$ decompositions offsets the GWN, and the final IMF components are
$$c_j(t) = \frac{1}{N}\sum_{i=1}^{N} c_{i,j}(t).$$
(5) Through the above steps, $x(t)$ is finally decomposed as
$$x(t) = \sum_{j=1}^{J} c_j(t) + r(t).$$
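The ensemble-averaging scheme above can be sketched in a few lines of Python. The sketch below assumes a generic `emd_func` decomposition routine (for example, one provided by the PyEMD package) and fixes the number of retained IMFs; it illustrates the averaging idea rather than the exact implementation used in this study.

```python
import numpy as np

def eemd(x, emd_func, n_trials=100, noise_std=0.2, n_imfs=4, seed=0):
    """Minimal EEMD sketch: average the IMFs of noise-perturbed copies of x.

    emd_func(signal) is assumed to return an array of shape (n_found_imfs, len(x));
    only the first n_imfs components are kept so that all trials can be averaged.
    """
    rng = np.random.default_rng(seed)
    acc = np.zeros((n_imfs, len(x)))
    for _ in range(n_trials):
        noisy = x + noise_std * np.std(x) * rng.standard_normal(len(x))  # step (2)
        imfs = emd_func(noisy)[:n_imfs]                                  # step (3)
        acc[:imfs.shape[0]] += imfs
    return acc / n_trials                                                # step (4)
```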

2.2. Domain Adaptation and MMD

Domain adaptation (DA) is a promising transfer learning-based approach for situations in which traditional pattern recognition and classification models do not achieve ideal results because of distribution discrepancies between the training and testing samples [1, 5]. Given a source domain $D_s = \{X_s, Y_s\}$ and a target domain $D_t = \{X_t\}$, $n_s$ and $n_t$ denote the numbers of samples in $D_s$ and $D_t$, respectively; $X_s$ and $X_t$ represent the data of the SD and TD, and $Y_s$ represents the label set corresponding to $X_s$. $X_s$ and $X_t$ are drawn from two different probability distributions, and the optimization goal of DA is to shrink the distribution discrepancy between the SD and TD [35].

MMD [36], a widely used nonparametric distance measure in TL, was proposed by Gretton et al. for estimating the distance between distributions in a reproducing kernel Hilbert space (RKHS). The MMD between the distributions of $X_s$ and $X_t$ can be expressed as
$$\mathrm{MMD}(X_s, X_t) = \left\| \frac{1}{n_s}\sum_{x_i \in X_s} \phi(x_i) - \frac{1}{n_t}\sum_{x_j \in X_t} \phi(x_j) \right\|_{\mathcal{H}}^{2},$$
where $\|\cdot\|_{\mathcal{H}}$ represents the RKHS norm and $\phi(\cdot)$ is the transformation function that maps the data into the RKHS. Because inconsistent feature distributions are common in CFD, MMD has been widely utilized to estimate distribution discrepancies between domains and to align data distributions.
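As an illustration, the empirical MMD with a Gaussian kernel can be computed as follows; the kernel choice and bandwidth are common defaults rather than values prescribed by this study.

```python
import numpy as np

def gaussian_kernel(a, b, gamma=1.0):
    """Gaussian (RBF) kernel matrix between sample sets a (n, d) and b (m, d)."""
    sq = np.sum(a**2, 1)[:, None] + np.sum(b**2, 1)[None, :] - 2 * a @ b.T
    return np.exp(-gamma * sq)

def mmd2(xs, xt, gamma=1.0):
    """Squared MMD between source samples xs and target samples xt in an RKHS."""
    k_ss = gaussian_kernel(xs, xs, gamma).mean()
    k_tt = gaussian_kernel(xt, xt, gamma).mean()
    k_st = gaussian_kernel(xs, xt, gamma).mean()
    return k_ss + k_tt - 2 * k_st
```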

2.3. Local Fisher Discriminant Analysis (LFDA)

LFDA, proposed by Sugiyama [37] by introducing locality into Fisher discriminant analysis, is a classical supervised dimensionality reduction approach. Let $x_i \in \mathbb{R}^d$ ($i = 1, \ldots, n$) be $d$-dimensional data and $y_i \in \{1, \ldots, c\}$ be the corresponding category labels, where $n$ and $c$ are the number of samples and the number of classes, respectively. According to the literature [37, 38], the objective of LDA is to maximize the ratio of the between-class scatter matrix (BSM) $S_b$ to the within-class scatter matrix (WSM) $S_w$:
$$A^{*} = \arg\max_{A} \frac{A^{T} S_b A}{A^{T} S_w A},$$
where $A$ is a mapping matrix, and $S_b$ and $S_w$ are defined as
$$S_b = \sum_{l=1}^{c} n_l (\mu_l - \mu)(\mu_l - \mu)^{T}, \qquad S_w = \sum_{l=1}^{c} \sum_{i: y_i = l} (x_i - \mu_l)(x_i - \mu_l)^{T},$$
where $n_l$ is the number of samples in class $l$, $\mu_l$ is the mean of class $l$, and $\mu$ is the overall mean. Compared with LDA, the further objective of LFDA is to maximize the between-class separability while simultaneously preserving the within-class local manifold structure in the dimension-reduced feature space. On the basis of the above $S_b$ and $S_w$, the local relationships of the feature data are incorporated into the definition of the weights. Accordingly, a local BSM $\tilde{S}_b$ and a local WSM $\tilde{S}_w$ are substituted for $S_b$ and $S_w$, respectively. They are expressed as follows [37]:
$$\tilde{S}_b = \frac{1}{2}\sum_{i,j=1}^{n} W^{(b)}_{ij} (x_i - x_j)(x_i - x_j)^{T}, \qquad \tilde{S}_w = \frac{1}{2}\sum_{i,j=1}^{n} W^{(w)}_{ij} (x_i - x_j)(x_i - x_j)^{T},$$
where
$$W^{(b)}_{ij} = \begin{cases} A_{ij}\left(\dfrac{1}{n} - \dfrac{1}{n_l}\right), & y_i = y_j = l,\\[4pt] \dfrac{1}{n}, & y_i \neq y_j, \end{cases} \qquad W^{(w)}_{ij} = \begin{cases} \dfrac{A_{ij}}{n_l}, & y_i = y_j = l,\\[4pt] 0, & y_i \neq y_j, \end{cases}$$
and the affinity $A_{ij}$ is defined as
$$A_{ij} = \exp\!\left(-\frac{\|x_i - x_j\|^{2}}{\sigma_i \sigma_j}\right),$$
where $\sigma_i$ and $\sigma_j$ are the local scaling around $x_i$ and $x_j$.

3. DFCD-IDAMN Framework

To achieve desirable CFD of bearings, this work designs a new DFCD-IDAMN framework based on the domain adaptation feature refinement method DFCD and the supervised joint distribution adaptation method IDAMN. The overall structure is presented in Figure 2. The DFCD-IDAMN framework is composed of four modules: the signal processing and feature extraction module, the feature refinement module, the domain adaptation module, and the adaptive classifier module. The modules are introduced as follows.

3.1. Signal Processing and Feature Extraction Module

The original bearing vibration signals usually possess severe nonlinearity and nonstationarity. To tackle this issue and extract features that effectively reflect fault states and support pattern recognition and classification, EEMD is first applied to decompose the collected vibration signals into several IMFs, and these IMFs are used to calculate the Hilbert envelope spectrum (HES) and Hilbert marginal spectrum (HMS). This module then calculates statistical parameters of the raw vibration signals, the decomposed IMFs, and the corresponding HES and HMS to obtain the corresponding statistical features. The procedure of this module is drawn in Figure 3.
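A minimal sketch of this step is shown below. The Hilbert envelope is obtained from the analytic signal of each IMF, and a handful of generic statistical parameters (mean, RMS, kurtosis, etc.) stand in for the full set listed in Table 3; both the parameter subset and the helper names are illustrative assumptions.

```python
import numpy as np
from scipy.signal import hilbert
from scipy.stats import kurtosis, skew

def envelope_spectrum(imf, fs):
    """Hilbert envelope spectrum of one IMF: FFT magnitude of the analytic-signal envelope."""
    env = np.abs(hilbert(imf))
    spec = np.abs(np.fft.rfft(env - env.mean())) / len(env)
    return spec, np.fft.rfftfreq(len(env), 1 / fs)

def stat_features(x):
    """A few illustrative statistical parameters of a sequence (subset of Table 3)."""
    rms = np.sqrt(np.mean(x**2))
    return np.array([x.mean(), x.std(), rms, np.abs(x).max(),
                     kurtosis(x), skew(x), np.abs(x).max() / (rms + 1e-12)])

def sample_feature_vector(signal, imfs, fs):
    """Concatenate statistics of the raw signal, each IMF, and each IMF envelope spectrum."""
    parts = [stat_features(signal)]
    for imf in imfs:
        spec, _ = envelope_spectrum(imf, fs)
        parts += [stat_features(imf), stat_features(spec)]
    return np.concatenate(parts)
```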

3.2. Feature Refinement Module

To strengthen the performance of the domain adaptation procedure, this work designs a feature refinement module that refines, from the original high-dimensional feature set, the domain adaptation features with more satisfactory fault discriminability and WCI. This module, domain adaptation feature refinement based on classification accuracy and distribution discrepancies (DFCD), is built using the classical KNN classifier, the maximum mean discrepancy (MMD), and the Kullback–Leibler divergence (KLD). The structure of DFCD is drawn in Figure 4. The feature datasets extracted from vibration signals under one operating condition and under other operating conditions are used as the SD and TD, respectively. To match conditions as close as possible to actual industrial scenarios, DFCD takes the following input: labeled feature data of the fault states and the normal state from the SD, unlabeled feature data of the fault states from the TD, and feature data of the normal state from the TD. The reason for this input setting is that in actual industrial scenes, the category of newly collected samples is unknown, whereas samples of all fault states under one specific working condition are usually easy to prepare and obtain; therefore, the feature data input from the TD are unlabeled. However, for any mechanical equipment, samples of the normal state under all working conditions are easily accessible. Accordingly, the labeled feature data from the SD are used to evaluate fault discriminability thanks to their known labels, and only the normal-state feature data from the TD are used to measure the WCI of each feature.

According to the structure shown in Figure 4, the labeled feature data (containing multiple fault categories) under one operating condition and the normal-state feature data under other operating conditions are used for feature evaluation. First, the labeled feature data are randomly divided into training and testing data, and a KNN classifier is trained to predict the class labels of the testing data. Accordingly, the classification accuracy of each feature can be used to measure its fault discriminability. Then, the normal-state feature data under the two working conditions are used to calculate the MMD and KLD of each feature, which quantifies its WCI. Finally, a novel evaluation index for domain adaptation feature refinement, the domain adaptability index (DAI), is built. In this study, we presume that a feature with a higher DAI is more advantageous to domain adaptation and fault classification. The detailed description of DFCD is as follows.

3.2.1. Evaluate Fault Discriminability of Feature Based on Classification Accuracy

Given a high-dimensional original feature set (OFS) that includes $P$ feature samples covering $K$ classes, each sample is constructed from $Q$ features, that is, $x_i = [f_i^1, f_i^2, \ldots, f_i^Q]$, where $f_i^q$ represents the $q$-th feature of the $i$-th sample. Accordingly, with each row collecting one feature over all samples, the OFS can be presented as
$$\mathrm{OFS} = \begin{bmatrix} f_1^1 & f_2^1 & \cdots & f_P^1 \\ f_1^2 & f_2^2 & \cdots & f_P^2 \\ \vdots & \vdots & \ddots & \vdots \\ f_1^Q & f_2^Q & \cdots & f_P^Q \end{bmatrix}.$$

The first row of the OFS, i.e., the first feature data $F^1 = [f_1^1, f_2^1, \ldots, f_P^1]$, is used to obtain a classification accuracy with the KNN classifier. The labeled $q$-th feature data from the source domain are randomly divided into a training dataset $\{F^q_{tr}, Y_{tr}\}$ and a testing dataset $\{F^q_{te}, Y_{te}\}$, where $F^q_{tr}$ and $F^q_{te}$ are the training and testing samples and $Y_{tr}$ and $Y_{te}$ are the corresponding labels. On this basis, $\{F^q_{tr}, Y_{tr}\}$ is employed to train the KNN classifier, and the trained KNN predicts the labels of $F^q_{te}$; the predicted labels, denoted $\hat{Y}_{te}$, are thus obtained. By comparing $\hat{Y}_{te}$ with $Y_{te}$, the number of correctly predicted samples $N_{cor}$ is obtained. Based on $N_{cor}$ and the number of testing samples $N_{te}$, the accuracy of the $q$-th feature is
$$\mathrm{accuracy}(q) = \frac{N_{cor}}{N_{te}}.$$

The remaining features are handled in the same way. Let $\mathrm{accuracy}(q)$ denote the classification accuracy of the $q$-th feature; the classification accuracy sequence $\{\mathrm{accuracy}(1), \mathrm{accuracy}(2), \ldots, \mathrm{accuracy}(Q)\}$ can then be obtained. In this study, we presume that a higher classification accuracy indicates better fault discriminability.
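The per-feature accuracy screening can be sketched with scikit-learn as follows; the train/test split ratio and the number of neighbors are illustrative choices rather than values taken from this study.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split

def feature_accuracies(features, labels, n_neighbors=5, test_size=0.5, seed=0):
    """accuracy(q) for each of the Q features: train a KNN on one feature at a time.

    features: array of shape (P, Q) holding Q features for P labeled source-domain samples.
    """
    P, Q = features.shape
    acc = np.zeros(Q)
    for q in range(Q):
        x = features[:, q].reshape(-1, 1)                 # use the q-th feature only
        x_tr, x_te, y_tr, y_te = train_test_split(
            x, labels, test_size=test_size, stratify=labels, random_state=seed)
        knn = KNeighborsClassifier(n_neighbors=n_neighbors).fit(x_tr, y_tr)
        acc[q] = knn.score(x_te, y_te)                    # fraction of correctly predicted labels
    return acc
```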

3.2.2. Measure WCI of Feature Based on MMD and KLD

For a more comprehensive WCI evaluation of features, MMD and KLD are employed to evaluate the distribution difference between feature samples from SD and TD. The basic principle of MMD is introduced in Section 2.2. The details of KLD are described as follows [39].

KLD is an effective metric for estimating distribution differences [40], and it is often applied in statistical learning, information theory, signal processing, and related fields. Given the probability density functions $p(x)$ and $q(x)$ of two different variables, the KLD is defined on the basis of information entropy as
$$\mathrm{KL}(p\,\|\,q) = \int p(x)\,\log\frac{p(x)}{q(x)}\,\mathrm{d}x,$$
where the function is not symmetric, that is, $\mathrm{KL}(p\,\|\,q) \neq \mathrm{KL}(q\,\|\,p)$.

According to the references [39-41], the symmetric form of the KLD can be denoted as
$$\mathrm{KLD}(p, q) = \mathrm{KL}(p\,\|\,q) + \mathrm{KL}(q\,\|\,p).$$

Based on the basic principles of MMD and KLD, the normal-state feature sets of the source and target domains are given as $N_s$ and $N_t$, respectively, and are expressed as
$$N_s = \begin{bmatrix} s_1^1 & \cdots & s_M^1 \\ \vdots & \ddots & \vdots \\ s_1^Q & \cdots & s_M^Q \end{bmatrix}, \qquad N_t = \begin{bmatrix} t_1^1 & \cdots & t_M^1 \\ \vdots & \ddots & \vdots \\ t_1^Q & \cdots & t_M^Q \end{bmatrix},$$
where $s_M^q$ represents the $q$-th feature of the $M$-th sample from the SD, $t_M^q$ represents the $q$-th feature of the $M$-th sample from the TD, and $M$ is the number of normal-state feature samples. The first rows of $N_s$ and $N_t$, i.e., the first feature data of the two domains, are used to calculate the MMD and KLD, which yields the MMD and KLD of the first feature between the SD and TD. The remaining features are handled in the same way. Let $\mathrm{mmd}(q)$ and $\mathrm{kld}(q)$ denote the MMD and KLD of the $q$-th feature, respectively; the MMD sequence $\{\mathrm{mmd}(1), \ldots, \mathrm{mmd}(Q)\}$ and the KLD sequence $\{\mathrm{kld}(1), \ldots, \mathrm{kld}(Q)\}$ can then be obtained. In this study, we presume that the WCI of a feature is better when the sum of its MMD and KLD is smaller.

3.2.3. Build the Domain Adaptability Index

According to the estimation of the fault discriminability and WCI of features based on the classification accuracy, MMD, and KLD, a new domain adaptability index, DAI, is proposed to assist in refining domain adaptation features. For the $q$-th feature, the DAI combines $\mathrm{accuracy}(q)$ with the discrepancy terms $\mathrm{mmd}(q)$ and $\mathrm{kld}(q)$ through a trade-off parameter $\lambda$, so that high fault discriminability increases the index while large cross-domain discrepancy decreases it. The DAI sequence of the $Q$ features, $\{\mathrm{DAI}(1), \ldots, \mathrm{DAI}(Q)\}$, can then be obtained. In this work, it is assumed that the domain adaptability of a feature is stronger when its DAI value is higher. Accordingly, domain adaptation features can be refined from the OHFS by sorting the DAI sequence in descending order, and the features with high DAI values form the feature subset used for domain adaptation.
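The refinement step can be sketched as below. The exact DAI equation is not reproduced here; the sketch assumes the simple form DAI(q) = accuracy(q) − λ·(mmd(q) + kld(q)), which matches the stated behavior (higher accuracy raises the index, larger discrepancy lowers it) but is only an illustrative assumption, not the paper's equation.

```python
import numpy as np

def refine_features(acc, mmd, kld, lam=0.5, n_keep=60):
    """Rank features by an assumed DAI and return the indices of the top n_keep.

    acc, mmd, kld: length-Q arrays of accuracy(q), mmd(q), kld(q).
    lam: trade-off parameter (lambda); the combination rule below is an assumption.
    """
    dai = acc - lam * (mmd + kld)          # assumed DAI form, not the paper's exact equation
    order = np.argsort(dai)[::-1]          # sort DAI in descending order
    return order[:n_keep], dai
```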

3.3. Improved JDA with Manifold Subspace Learning and Neighborhood Relationship Preserving (IDAMN)

Many existing feature-based TL DA approaches suffer from three significant issues: (1) in most studies, distribution adaptation is implemented by aligning probability distributions in the original complex and high-dimensional feature space, which makes it difficult to tackle feature distortion and may lead to poor domain adaptation performance [28]; (2) the optimization goals of numerous off-the-shelf DA methods merely concentrate on decreasing distribution differences and enhancing feature transferability, while the class distinguishability of features is usually neglected, which may lead to poor classification performance [29, 30]; and (3) in the process of distribution adaptation, the impact of class information and neighborhood relationships of the feature data has not been effectively considered, which may degrade the CFD performance and generalization ability of the model [28, 30]. Therefore, in this section, based on the idea of joint distribution alignment with neighborhood relationship preservation in a manifold subspace, a novel domain adaptation method, IDAMN, is designed. IDAMN consists of four steps: (1) Grassmann manifold subspace learning; (2) joint distribution alignment; (3) neighborhood relationship preserving; and (4) improved joint distribution adaptation. The details of IDAMN are presented as follows.

3.3.1. Grassmann Manifold Subspace Learning

This work applies the geodesic flow kernel (GFK), a classical unsupervised manifold learning approach, to learn the low-dimensional manifold structure of the feature set in the original high-dimensional space [42]. Accordingly, features with certain geometrical structures in the manifold subspace can be obtained, which overcomes the problem of feature distortion in the raw feature space [28, 43]. Given the labeled feature dataset of the SD and the feature dataset of the TD, expressed as $X_s$ and $X_t$, respectively, GFK maps the original feature data $X_s$ and $X_t$ into the Grassmann manifold (GM) space $G(d)$ via $z = \sqrt{G}\,x$ [20, 42], yielding the manifold representations $Z_s$ and $Z_t$. A detailed introduction to GFK can be found in [20, 42].

In particular, the subspace dimension of GFK must be set to less than half of the input feature space dimension. Therefore, for the scenario in which the input feature dimension is less than twice the set manifold subspace dimension, a dimension comparison and automatic adjustment is conducted before executing the unsupervised manifold learning of GFK. Specifically, if the feature dimension is less than twice the set manifold subspace dimension, the manifold subspace dimension is reset to half of the feature dimension; conversely, if the feature dimension is at least twice the set manifold subspace dimension, GFK is implemented with the originally set manifold subspace dimension.
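This adjustment rule amounts to a one-liner; the sketch below mirrors the description above (integer division for odd feature dimensions is an assumption).

```python
def adjust_subspace_dim(n_features, d_set):
    """Return the manifold subspace dimension actually used by GFK."""
    return n_features // 2 if n_features < 2 * d_set else d_set
```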

3.3.2. Joint Distribution Alignment

In order to further shrink the distribution divergences between SD and TD, joint distribution alignment is introduced. It includes two parts: marginal distribution alignment (MDA) and conditional distribution alignment (CDA).

(1) MDA. Let $Z_s$ and $Z_t$ denote the representations of the SD and TD data in the GM space, respectively, with corresponding marginal distributions $P(Z_s)$ and $P(Z_t)$. Marginal distribution alignment is conducted by minimizing the MMD between $P(Z_s)$ and $P(Z_t)$ [17]. The MMD between $Z_s$ and $Z_t$ is
$$\mathrm{MMD}(Z_s, Z_t) = \left\| \frac{1}{n_s}\sum_{z_i \in Z_s} W^{T} z_i - \frac{1}{n_t}\sum_{z_j \in Z_t} W^{T} z_j \right\|_{\mathcal{H}}^{2} = \mathrm{tr}\!\left(W^{T} Z M_0 Z^{T} W\right), \quad (22)$$
where $\mathcal{H}$ represents the RKHS, $\mathrm{tr}(\cdot)$ represents the trace, $W$ is the optimal transformation matrix, and $Z$ denotes the input feature data matrix composed of $Z_s$ and $Z_t$. The MMD matrix $M_0$ is defined as
$$(M_0)_{ij} = \begin{cases} \dfrac{1}{n_s n_s}, & z_i, z_j \in Z_s,\\[4pt] \dfrac{1}{n_t n_t}, & z_i, z_j \in Z_t,\\[4pt] -\dfrac{1}{n_s n_t}, & \text{otherwise}, \end{cases} \quad (23)$$
where $n_s$ and $n_t$ are the numbers of samples in $Z_s$ and $Z_t$, respectively. By minimizing equation (22), a new representation is obtained in which the marginal distribution discrepancy between the SD and TD is narrowed.

(2) CDA. The CDA is conducted by minimizing the MMD between the conditional distributions $P(y_s \mid z_s)$ and $P(y_t \mid z_t)$ [18]. Since the TD labels are unavailable, a base classifier $f$ trained on $Z_s$ with $Y_s$ is used to predict pseudo labels $\hat{Y}_t$ for the TD data [18]. Because the posterior probabilities $P(y_s \mid z_s)$ and $P(y_t \mid z_t)$ are quite involved, their sufficient statistics, the class-conditional distributions $P(z_s \mid y_s = c)$ and $P(z_t \mid y_t = c)$, are explored instead [44], where $c \in \{1, \ldots, C\}$ is a category in the label set and $C$ is the total number of categories [18]. Therefore, the MMD between the class-conditional distributions can be expressed as
$$\mathrm{MMD}_c = \left\| \frac{1}{n_s^{(c)}}\sum_{z_i \in Z_s^{(c)}} W^{T} z_i - \frac{1}{n_t^{(c)}}\sum_{z_j \in Z_t^{(c)}} W^{T} z_j \right\|_{\mathcal{H}}^{2} = \mathrm{tr}\!\left(W^{T} Z M_c Z^{T} W\right), \quad (24)$$
where $Z_s^{(c)}$ and $Z_t^{(c)}$ are, respectively, the feature sets pertaining to class $c$ in the SD and the TD (the latter determined by the pseudo labels $\hat{Y}_t$), and $n_s^{(c)}$ and $n_t^{(c)}$ are the corresponding numbers of samples. Accordingly, the MMD matrix $M_c$ is obtained by
$$(M_c)_{ij} = \begin{cases} \dfrac{1}{n_s^{(c)} n_s^{(c)}}, & z_i, z_j \in Z_s^{(c)},\\[4pt] \dfrac{1}{n_t^{(c)} n_t^{(c)}}, & z_i, z_j \in Z_t^{(c)},\\[4pt] -\dfrac{1}{n_s^{(c)} n_t^{(c)}}, & z_i \in Z_s^{(c)}, z_j \in Z_t^{(c)} \ \text{or} \ z_i \in Z_t^{(c)}, z_j \in Z_s^{(c)},\\[4pt] 0, & \text{otherwise}. \end{cases} \quad (25)$$

When the minimum of equation (24) is achieved, a new representation is obtained in which the conditional distribution discrepancies between $Z_s^{(c)}$ and $Z_t^{(c)}$ are narrowed.
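The JDA-style MMD matrices can be assembled as in the sketch below, following equations (23) and (25); the stacking convention (source samples first) is an implementation assumption.

```python
import numpy as np

def mmd_matrices(y_s, y_t_pseudo):
    """Build M_0 and the per-class M_c for data stacked as [source; target]."""
    n_s, n_t = len(y_s), len(y_t_pseudo)
    n = n_s + n_t
    e = np.concatenate([np.full(n_s, 1.0 / n_s), np.full(n_t, -1.0 / n_t)])
    M0 = np.outer(e, e)                                   # equation (23)
    Mc_list = []
    for c in np.unique(y_s):
        ec = np.zeros(n)
        src_c = np.where(y_s == c)[0]
        tgt_c = n_s + np.where(y_t_pseudo == c)[0]
        if len(src_c):
            ec[src_c] = 1.0 / len(src_c)
        if len(tgt_c):
            ec[tgt_c] = -1.0 / len(tgt_c)
        Mc_list.append(np.outer(ec, ec))                  # equation (25)
    return M0, Mc_list
```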

3.3.3. Neighborhood Relationships Preserving

To consider the impact of class information and neighborhood relationships of the feature data in the process of distribution adaptation, inspired by the principles of LDA [45] and LFDA [37], a new local minimum margin criterion matrix (LMMCM) is designed to utilize the label information while preserving the local neighborhood geometry of the feature data. The LMMCM $M_L$ is expressed as
$$M_L = \tilde{S}_w - \tilde{S}_b, \quad (26)$$
where $\tilde{S}_w$ and $\tilde{S}_b$ are the local WSM and local BSM, expressed as
$$\tilde{S}_w = \frac{1}{2}\sum_{i,j} W^{(w)}_{ij}\,(x_i - x_j)(x_i - x_j)^{T}, \quad (27)$$
$$\tilde{S}_b = \frac{1}{2}\sum_{i,j} W^{(b)}_{ij}\,(x_i - x_j)(x_i - x_j)^{T}, \quad (28)$$
where
$$W^{(w)}_{ij} = \begin{cases} \dfrac{A_{ij}}{n_l}, & y_i = y_j = l,\\[4pt] 0, & \text{otherwise}, \end{cases} \qquad W^{(b)}_{ij} = \begin{cases} A_{ij}, & j \in \mathcal{N}(i) \ \text{and} \ y_i \neq y_j,\\[4pt] 0, & \text{otherwise}, \end{cases}$$
where $n$, $l$, and $n_l$ are, respectively, the number of feature samples, the class label of a feature sample, and the number of feature samples belonging to class $l$; $W^{(w)}$ and $W^{(b)}$ constitute the weight matrices. In $W^{(b)}$, the condition $j \in \mathcal{N}(i)$ means that $x_j$ is a nearest neighbor of $x_i$ and they pertain to different classes. The affinity $A_{ij}$ is defined as
$$A_{ij} = \exp\!\left(-\frac{\|x_i - x_j\|^{2}}{\sigma_i \sigma_j}\right), \qquad \sigma_i = \|x_i - x_i^{(m)}\|,$$
where $\sigma_i$ represents the local scaling around $x_i$ and $x_i^{(m)}$ is the $m$-th nearest neighbor of $x_i$. When $x_i$ and $x_j$ are closer, $A_{ij}$ is larger; otherwise, $A_{ij}$ is smaller. By introducing the LMMCM, the local neighborhood geometry of the feature data, including the neighborhood relationships between data of the same class and between data of different classes, is taken into account. Furthermore, the class label information is effectively introduced, and the discriminability of the feature data can be improved by minimizing the LMMCM.
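A sketch of the local scatter construction is given below. The combination M_L = S_w_local − S_b_local follows the reconstruction above and should be treated as an assumption, and the neighbor count m is an illustrative parameter.

```python
import numpy as np
from scipy.spatial.distance import cdist

def lmmcm(X, y, m=7):
    """Local minimum margin criterion matrix (assumed form: local within- minus between-class scatter).

    X: (n, d) source-domain feature data; y: (n,) class labels; m: neighbor index for local scaling.
    """
    n, d = X.shape
    D = cdist(X, X)
    sigma = np.sort(D, axis=1)[:, m]                      # distance to the m-th nearest neighbor
    A = np.exp(-D**2 / (np.outer(sigma, sigma) + 1e-12))  # local-scaling affinity
    knn = np.argsort(D, axis=1)[:, 1:m + 1]               # nearest-neighbor index sets (self excluded)
    Sw = np.zeros((d, d))
    Sb = np.zeros((d, d))
    for i in range(n):
        for j in range(n):
            diff = np.outer(X[i] - X[j], X[i] - X[j])
            if y[i] == y[j]:
                n_l = np.sum(y == y[i])
                Sw += 0.5 * (A[i, j] / n_l) * diff        # same-class local within scatter
            elif j in knn[i]:
                Sb += 0.5 * A[i, j] * diff                # different-class nearest-neighbor scatter
    return Sw - Sb                                        # assumed LMMCM
```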

3.3.4. Improved Joint Distribution Adaptation

On the basis of the above three components, we design an improved joint distribution adaptation term $D_f$, defined as
$$D_f = (1-\mu)\,\mathrm{MMD}(Z_s, Z_t) + \mu \sum_{c=1}^{C} \mathrm{MMD}_c,$$
where the adjustable parameters $(1-\mu)$ and $\mu$, with $\mu \in [0, 1]$, tune the proportion of the marginal and conditional distribution adaptation. According to equations (22) and (24), $D_f$ can be further expressed as
$$D_f = \mathrm{tr}\!\left(W^{T} Z \Big((1-\mu) M_0 + \mu \sum_{c=1}^{C} M_c\Big) Z^{T} W\right).$$

According to the optimization objective of JDA and equation (26), the optimization goal of IDAMN can be defined as
$$\min_{W^{T} Z H Z^{T} W = I}\ \mathrm{tr}\!\Big(W^{T} Z \big((1-\mu) M_0 + \mu \textstyle\sum_{c=1}^{C} M_c\big) Z^{T} W\Big) + \alpha\,\mathrm{tr}\!\left(W^{T} M_L W\right) + \eta\,\|W\|_F^{2}, \quad (34)$$
where $\alpha$ weights the LMMCM term, $\eta$ is the regularization parameter associated with the Frobenius norm $\|W\|_F^{2}$ and is used to ensure that the optimization problem is well-defined, and $I$ and $H$ represent the identity matrix and the centering matrix, respectively, with $H = I - \frac{1}{n}\mathbf{1}\mathbf{1}^{T}$, where $\mathbf{1}$ is the matrix of ones. To solve equation (34), based on constrained optimization theory, the Lagrange multipliers $\Phi = \mathrm{diag}(\phi_1, \ldots, \phi_k)$ are introduced; accordingly, the Lagrange function for solving equation (34) is
$$L = \mathrm{tr}\!\Big(W^{T}\big(Z ((1-\mu) M_0 + \mu \textstyle\sum_{c} M_c) Z^{T} + \alpha M_L + \eta I\big) W\Big) + \mathrm{tr}\!\left(\big(I - W^{T} Z H Z^{T} W\big)\Phi\right). \quad (35)$$

By setting the derivative $\partial L / \partial W = 0$, the solution of equation (34) can be derived as the following generalized eigendecomposition problem:
$$\left(Z\Big((1-\mu) M_0 + \mu \sum_{c} M_c\Big) Z^{T} + \alpha M_L + \eta I\right) W = Z H Z^{T} W \Phi. \quad (36)$$

According to equation (36), the optimal adaptation matrix $W$ is finally built from the $k$ smallest eigenvectors, and the new feature representations $W^{T}Z_s$ and $W^{T}Z_t$ are obtained. The labeled representation $W^{T}Z_s$ is then used to learn an adaptive classifier $f$, and the learned classifier $f$ is employed to predict the labels of the unlabeled $W^{T}Z_t$.
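A minimal sketch of this solution step with SciPy is shown below; it assumes the matrices from the previous sketches and selects the eigenvectors of the k smallest eigenvalues exactly as described above.

```python
import numpy as np
import scipy.linalg

def solve_adaptation_matrix(Z, M, M_L, H, alpha, eta, k):
    """Solve (Z M Z^T + alpha*M_L + eta*I) W = Z H Z^T W Phi and keep the k smallest eigenvectors.

    Z: (d, n) combined source/target data in the manifold subspace (columns are samples).
    M: combined MMD matrix, e.g., (1-mu)*M0 + mu*sum(Mc).  M_L: LMMCM (d, d).  H: centering matrix (n, n).
    """
    d = Z.shape[0]
    A = Z @ M @ Z.T + alpha * M_L + eta * np.eye(d)
    B = Z @ H @ Z.T
    vals, vecs = scipy.linalg.eig(A, B)                   # generalized eigendecomposition
    order = np.argsort(vals.real)                         # k smallest eigenvalues
    W = vecs[:, order[:k]].real
    return W                                              # new representations: W.T @ Z
```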

In summary, the complete procedure of IDAMN is as follows:
(1) Input: the source and target domain feature sets $X_s$ and $X_t$, the true labels $Y_s$ of $X_s$, the manifold subspace dimension $d$, the regularization parameters $\mu$, $\alpha$, and $\eta$, the output feature space dimension $k$, and the number of iterations $i$. Let $n_f$ be the dimension of $X_s$ (or $X_t$); when $n_f < 2d$, the manifold subspace dimension $d$ is reset to $n_f/2$.
(2) Learn the Grassmann manifold transformation kernel $G$ to transform the original feature data $X_s$ and $X_t$ into $G(d)$ via $z = \sqrt{G}\,x$. Accordingly, the new source domain $Z_s$ and new target domain $Z_t$ are obtained.
(3) Learn a base classifier on $\{Z_s, Y_s\}$ and predict on $Z_t$ to obtain its pseudo labels $\hat{Y}_t$.
(4) Form $Z = [Z_s, Z_t]$; compute $M_0$ and $M_c$ by equations (23) and (25), and compute $\tilde{S}_w$ and $\tilde{S}_b$ (and thus $M_L$) by equations (27) and (28).
(5) Solve the eigendecomposition problem in equation (36) and use the $k$ smallest eigenvectors to form the adaptation matrix $W$, yielding $W^{T}Z_s$ and $W^{T}Z_t$.
(6) Train an adaptive classifier $f$ on $\{W^{T}Z_s, Y_s\}$ and update the pseudo labels of the target domain data, $\hat{Y}_t$.
(7) Reconstruct the MMD matrices $M_c$ by equation (25).
(8) Repeat from step (4) until the number of iterations reaches $i$.
(9) Output the learned adaptive classifier $f$.
A compact code sketch of this iterative loop is given after this list.
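The sketch below reuses the `mmd_matrices`, `lmmcm`, and `solve_adaptation_matrix` helpers defined in the earlier sketches and a KNN base classifier; it takes features already mapped into the manifold subspace, so the GFK step is assumed to have been applied beforehand, and all parameter defaults are illustrative.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def idamn(Zs, ys, Zt, mu=0.5, alpha=0.1, eta=1.0, k=20, n_iter=10):
    """Iterative improved-JDA sketch using the helper functions from the previous sketches.

    Zs, Zt: (n_s, d) and (n_t, d) manifold-subspace features (e.g., after GFK mapping).
    """
    ys = np.asarray(ys)
    Z = np.vstack([Zs, Zt]).T                              # d x n, columns are samples
    n = Z.shape[1]
    H = np.eye(n) - np.ones((n, n)) / n                    # centering matrix
    M_L = lmmcm(Zs, ys)                                    # neighborhood-preserving criterion
    clf = KNeighborsClassifier().fit(Zs, ys)
    y_pseudo = clf.predict(Zt)                             # initial pseudo labels of the TD
    for _ in range(n_iter):
        M0, Mc_list = mmd_matrices(ys, y_pseudo)
        M = (1 - mu) * M0 + mu * sum(Mc_list)              # improved joint distribution term
        W = solve_adaptation_matrix(Z, M, M_L, H, alpha, eta, k)
        As = (W.T @ Z[:, :len(ys)]).T                      # new source representation
        At = (W.T @ Z[:, len(ys):]).T                      # new target representation
        clf = KNeighborsClassifier().fit(As, ys)           # adaptive classifier
        y_pseudo = clf.predict(At)                         # update pseudo labels
    return clf, As, At
```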

3.4. Complete Process of the Cross-Domain Fault Diagnosis Based on the DFCD-IDAMN

Based on the DFCD-IDAMN framework and the cross-domain fault diagnosis tasks, the complete process is described as follows:
(1) Input the fault vibration signals collected under a specific working condition and under an unknown working condition, denoted $S_s$ and $S_t$, respectively; $S_s$ and $S_t$ represent the source and target domain data. The class information of $S_s$ is known, but the class information of $S_t$ is unknown.
(2) $S_s$ and $S_t$ are each decomposed into several IMFs by EEMD, and these IMFs are used to calculate the HES and HMS. Based on the raw vibration signals $S_s$ and $S_t$, the IMFs, and the corresponding HES and HMS, the statistical parameters are calculated to obtain the corresponding statistical features, and the high-dimensional statistical feature sets $X_s$ and $X_t$ of the source and target domains are built.
(3) $X_s$ and $X_t$ are input into the DFCD module, with the trade-off parameter $\lambda$ set in this step; the fault discriminability evaluation and the WCI measurement of the features are conducted, which yields the domain adaptability index used to refine $X_s$ and $X_t$. The refined feature sets of the source and target domains are thus obtained for the subsequent step.
(4) Input the refined source and target domain feature sets, the true labels $Y_s$, the manifold subspace dimension $d$, the regularization parameters $\mu$, $\alpha$, and $\eta$, the output feature space dimension $k$, and the number of iterations $i$. On this basis, the proposed IDAMN is performed; accordingly, the new feature sets and the adaptive classifier $f$ are obtained. Finally, the cross-domain fault diagnosis accuracy is calculated.

4. Experimental Verification

In this work, to validate the performance and superiority of the proposed methods, two bearing fault datasets, obtained from the Case Western Reserve University (CWRU) test platform [3, 6, 29, 31, 46-48] and the SQI-MFS test platform [29, 31, 44, 46], are employed for a set of case studies. To clearly illustrate the superiority of the proposed methods (DFCD and IDAMN), comparative models are built using common off-the-shelf methods: KNN, SVM, DAE, CNN, DBN, JDA, TCA, JGSA, BDA, and GFK.

4.1. Case 1: Fault Diagnosis of Bearing Dataset 1 across Different Working Loads
4.1.1. Description of Bearing Dataset and Fault Diagnosis Tasks

In case 1, the bearing vibration dataset gained from the CWRU test bed is utilized to conduct CFD experiments. The test platform is presented in Figure 5. The bearing vibration signals are sampled through acceleration sensors at a 12 kHz sampling frequency. Table 1 lists the description of the bearing vibration dataset. There are three categories of bearing defect: inner raceway defect (IRD), ball defect (BD), and outer raceway defect (ORD). The defect parameters are 0.028 inch, 0.021 inch, 0.014 inch, and 0.007 inch. Moreover, vibration data for bearings without defects are also used. To set up the CFD tasks, bearing data under motor loads of 0 hp, 1 hp, 2 hp, and 3 hp are chosen for the experiments. Accordingly, bearing vibration data of 12 classes, labeled 1-12, are obtained. For each class, 60 samples are used to build a training set and a testing set, with 20 and 40 samples randomly divided as training and testing samples, respectively. Each sample is composed of 2000 continuous data points from the original vibration signals. Based on the bearing data presented in Table 1, 12 CFD tasks are arranged, as listed in Table 2.

4.1.2. Diagnosis Results of the DFCD-IDAMN Framework

In this section, following the overall procedure of the DFCD-IDAMN framework shown in Figure 2, signal processing and feature extraction are first conducted, and the primitive signals are decomposed into several IMFs by EEMD. Although the obtained IMFs are by default ordered from high frequency to low frequency, not every IMF effectively represents the time-frequency features of a fault signal [49]. For this issue, according to related previous research [49-51], the correlation coefficient between each IMF and the raw vibration signal is utilized to remove redundant IMFs; an IMF is more closely related to the original vibration signal and contains richer time-frequency information when its correlation coefficient is higher. Therefore, following the literature [49], the first four IMFs are used for feature extraction; furthermore, the four Hilbert envelope spectra (HES) of the four IMFs and one Hilbert marginal spectrum (HMS) calculated from the four IMFs are also used to generate statistical features. Accordingly, 4 IMFs, 4 HES, and 1 HMS are obtained from each vibration signal, and the 18 statistical parameters [29, 31, 44, 52-55] listed in Table 3 are calculated from them, from which 162 statistical features are extracted to form the original high-dimensional feature set. Vibration signal samples of the no-defect bearing and the inner raceway defect under motor loads of 0 hp, 1 hp, 2 hp, and 3 hp are presented in Figure 6. The corresponding IMFs from these samples are presented in Figures 7 and 8.

Second, the feature refinement module is carried out. The proposed DFCD evaluates the fault distinguishability and WCI of the 162 statistical features, which yields their DAI values and helps refine features with better domain adaptability from the high-dimensional original feature set. Taking the no-defect vibration data under a motor load of 0 hp as an example, Figure 9 presents the DAI of the 162 statistical features. It can be seen that different features have different DAI values, indicating different domain adaptability quantification results. The DAI values of the 39th and 42nd features are significantly higher than those of the other features, which shows that their domain adaptability is more prominent. In this study, we assume that a higher DAI value indicates greater domain adaptability. Therefore, DFCD can refine the features that are more advantageous to domain adaptation by manually selecting a threshold on the DAI value, and these refined features are processed by the subsequent domain adaptation module.

Next, the refined features obtained by the feature refinement module constitute a cross-domain adaptation feature set (CDAF), and the labeled CDAF of the SD and the unlabeled CDAF of the TD are input into the proposed IDAMN domain adaptation method, which performs joint distribution alignment with neighborhood relationship preservation in the Grassmann manifold subspace and learns an adaptive classifier f for CFD. Finally, the learned classifier f predicts the labels of the target domain feature set, and the CFD result is calculated.

After performing the above steps, the experimental results of the 12 CFD tasks are listed in Table 4, which shows the mean diagnosis accuracies of the 12 bearing defect types under different numbers of domain adaptation features (nf). From the diagnosis accuracies of these 12 CFD tasks, the following analysis can be drawn. First, the proposed DFCD-IDAMN framework for CFD of bearings achieves ideal fault diagnosis results. The diagnosis accuracies of tasks 2, 4, 5, 6, 9, and 12 reach 100% with a suitable nf, and tasks 1, 3, and 7 attain over 99.5% accuracy. Accordingly, the effectiveness of the DFCD-IDAMN framework is validated. Second, it is evident that the proposed DFCD has an apparent effect on the fault diagnosis accuracy. Without DFCD, all 162 features are utilized for the subsequent IDAMN domain adaptation and CFD, and the diagnosis results are not ideal: the accuracies of tasks 1-12 are 96.46%, 99.58%, 83.33%, 99.17%, 100.00%, 89.58%, 98.54%, 97.29%, 99.38%, 95.83%, 82.29%, and 99.58%, respectively. When DFCD is applied and the refined CDAF is employed in the subsequent procedure, the CFD accuracies are apparently higher than those obtained without DFCD. The maximum accuracies (mda) of the 12 CFD tasks are 99.79%, 100.00%, 99.58%, 100.00%, 100.00%, 100.00%, 99.58%, 98.54%, 100.00%, 96.88%, 91.67%, and 100.00%, respectively. Therefore, the effectiveness of DFCD with a suitable nf for improving fault diagnosis accuracy is verified. The above CFD experiment involves several parameters of DFCD and IDAMN that are chosen manually; their specific values are set based on experimental experience and are presented directly here. For DFCD, the parameter to be set is the trade-off parameter λ. The parameters set in IDAMN include the manifold subspace dimension d = 50, the regularization parameters μ, α, and η, the output feature space dimension k = 20, and the number of iterations i = 10. In particular, although the manifold subspace dimension is set to 50, when the feature dimension after the proposed feature refinement (that is, nf) is less than twice the set manifold subspace dimension, the manifold subspace dimension is automatically adjusted to half of nf. In Table 4, when nf is 40, 50, 60, 70, 80, and 90, the manifold subspace dimension is automatically adjusted to 20, 25, 30, 35, 40, and 45, respectively. Conversely, when nf is not less than twice the set manifold subspace dimension (nf from 100 to 162), GFK is implemented with the set manifold subspace dimension of 50.

4.1.3. Comparative Analysis with Other Fault Diagnosis Models

To further validate the advantages of the DFCD-IDAMN framework for CFD, several common and competitive approaches are used to conduct a series of comparison experiments; these methods include KNN, SVM, DAE, CNN, DBN, JDA, TCA, JGSA, BDA, and GFK. The reasons for this setup are as follows: (1) three categories of methods are chosen, namely, classical machine learning methods, classical deep learning methods, and classical transfer learning methods, so that the differences in their effectiveness can be compared; (2) KNN and SVM are classic, widely used, and highly representative classifiers; (3) DAE, CNN, and DBN are widely developed and studied classical deep learning approaches; and (4) JDA, TCA, JGSA, BDA, and GFK are representative transfer learning methods that have gradually received attention and study from many researchers in recent years.

Table 5 presents the comparative models built from these methods, DFCD, and IDAMN. The comparative models are labeled M1-M18 and can be divided into three types. (1) Models not combined with domain adaptation methods, which only use the original high-dimensional feature set (OHFS) and classical classifiers. Taking M1 as an example, it is a classical classifier-based model in which the OHFS is directly input into the SVM classifier for cross-domain fault diagnosis. (2) Models combined with domain adaptation methods, which use the OHFS, a domain adaptation method, and a base classifier. Taking M7 as an example, it is a domain adaptation-based model in which the OHFS is first processed by TCA and the output features are input into the KNN classifier. (3) Models combined with DFCD and domain adaptation methods, which use the OHFS, DFCD, a domain adaptation method, and a base classifier. Taking M13 as an example, the OHFS is first refined by the proposed DFCD, the refined features are then processed by TCA, and finally the output features are input into the KNN classifier.

The fault diagnosis results of the M1-M18 models are shown in Table 6 and Figures 10-14. It is obvious that the M18 model obtained by the proposed DFCD-IDAMN framework achieves better CFD performance than the other comparative models. The detailed comparative analysis is as follows. (1) Compared with M1-M6 (base classifier-based models), the fault diagnosis accuracies of tasks 1-12 of the DFCD-IDAMN model are remarkably higher. In Figure 14, the mean fault diagnosis accuracy of the 12 tasks of the DFCD-IDAMN model reaches 99.57%, which is, respectively, 8.11%, 8.69%, 7.36%, 12.32%, 13.37%, and 18.01% higher than that of the M1-M6 models. (2) Comparing the OHFS-IDAMN (M12) model with the M7-M11 models (domain adaptation-based models), the accuracies of the 12 tasks are noticeably higher than those of M7-M11. Accordingly, the DA ability of IDAMN outperforms traditional TCA, JDA, BDA, JGSA, and GFK. (3) Comparing M7-M12 (domain adaptation-based models without DFCD) with M13-M18 (domain adaptation-based models with DFCD), it is easily found that the use of DFCD significantly enhances the fault diagnosis accuracy of domain adaptation-based models. Taking OHFS-TCA (M7) and OHFS-DFCD-TCA (M13) as examples, the diagnosis accuracies of tasks 1-12 of the M13 model are, respectively, 98.75%, 99.79%, 90.63%, 97.92%, 100.00%, 97.92%, 97.50%, 96.67%, 99.38%, 89.79%, 97.92%, and 100.00%, which surpass those of the M7 model. Therefore, DFCD can help refine features with strong domain adaptability, which effectively strengthens DA performance and increases fault diagnosis accuracy.

Moreover, we select other studies that used similar DA methods for cross-domain fault diagnosis experiments comparable to ours and compare our experimental results with theirs. Table 7 presents the comparison results, which show that our proposed fault diagnosis method outperforms the methods proposed in the corresponding literature. To sum up, extensive comparative experiments are conducted, and the results prove the validity and advantages of the DFCD-IDAMN framework under diverse working loads.

4.2. Case 2: Fault Diagnosis of Bearing Dataset 2 across Different Working Speeds
4.2.1. Description of Bearing Dataset and Fault Diagnosis Tasks

To further prove the validity and flexibility of the DFCD-IDAMN framework for CFD, in this case, the bearing vibration dataset sampled from the SQI-MFS test platform is utilized for fault diagnosis experiments. The test platform is presented in Figure 15. The bearing vibration signals are sampled through acceleration sensors at a 16 kHz sampling frequency. Table 8 lists the description of the bearing vibration dataset. There are three categories of bearing defect: inner raceway defect (IRD), ball defect (BD), and outer raceway defect (ORD). The defect parameters are 0.05 mm, 0.1 mm, and 0.2 mm. Moreover, vibration data for bearings without defects are also used. To set up the CFD tasks, the bearing vibration data under different motor speeds are utilized in the experiments. Accordingly, bearing vibration data of 10 classes, labeled 1-10, are obtained. For each class, 90 samples are used to build a training set and a testing set, with 30 and 60 samples randomly divided as training and testing samples, respectively. Each sample is composed of 5000 continuous data points from the original vibration signals. Based on the bearing vibration data listed in Table 8, 2 CFD tasks are set for the experiments, and the details are shown in Table 9.

4.2.2. Diagnosis Results of the Proposed DFCD-IDAMN Framework

To further demonstrate the performance and advantages of the DFCD-IDAMN framework, bearing datasets from the SQI-MFS test bed under diverse working speeds are employed for CFD experiments, and the procedure is similar to that of case 1. Taking the no-defect vibration data under a motor speed of 1730 rpm as an example, Figure 16 presents the DAI of the 162 statistical features. It can be seen that different features have different DAI values, indicating different domain adaptability quantification results. The DAI values of the 3rd, 6th, 16th, 21st, and 24th features are significantly higher than those of the other features, which shows that their domain adaptability is more significant. Since this work assumes that a higher DAI value indicates greater domain adaptability, DFCD can refine the features that are more advantageous to domain adaptation by manually selecting a threshold on the DAI value, and these refined features are processed by the subsequent domain adaptation module. Table 10 lists the diagnosis results of the 2 CFD tasks under different nf, from which conclusions similar to those of case 1 can be drawn. First, the model built by the DFCD-IDAMN framework attains ideal results; the maximum diagnosis accuracies of tasks 1 and 2 are 91.83% and 95.17%, respectively. Second, the significant enhancement brought by DFCD to CFD performance is further proven. When DFCD is not applied, all 162 features are employed for the subsequent IDAMN domain adaptation and fault classification, and the diagnosis results (task 1: 86.17%, task 2: 86.83%) are not ideal. When DFCD is used and the refined CDAF is employed in the subsequent procedure, obviously improved CFD accuracies are attained. Therefore, the effectiveness of the DFCD-IDAMN framework is validated again. The above CFD experiment involves several parameters of DFCD and IDAMN that are set manually; their specific values are chosen based on experimental experience and are presented directly here. For DFCD, the parameter is the trade-off parameter λ. The parameters set manually in IDAMN include the manifold subspace dimension d = 40, the regularization parameters μ, α, and η, the output feature space dimension k = 20, and the number of iterations i = 10. In particular, although the manifold subspace dimension is set to 40, when the feature dimension after the proposed feature refinement (that is, nf) is less than twice the set manifold subspace dimension, the manifold subspace dimension is automatically adjusted to half of nf. In Table 10, when nf is 40, 50, 60, and 70, the manifold subspace dimension is automatically adjusted to 20, 25, 30, and 35, respectively. Conversely, when nf is not less than twice the set manifold subspace dimension (nf from 80 to 162), GFK is implemented with the set manifold subspace dimension of 40.

4.2.3. Comparative Analysis with Other Fault Diagnosis Models

The comparative models used in this section are also those shown in Table 5, and the experimental contents are the same as in case 1. The corresponding cross-domain fault diagnosis results are listed in Table 11 and Figure 17. It is again concluded that the performance of the model built by the DFCD-IDAMN framework significantly surpasses that of the other models. The detailed comparative analysis is as follows. (1) Comparing the DFCD-IDAMN model with M1-M6 (base classifier-based models), the diagnosis accuracies of tasks 1 and 2 of the DFCD-IDAMN model are remarkably higher than those of the M1-M6 models. Moreover, the OHFS-IDAMN model achieves higher diagnosis accuracies in tasks 1 and 2 than the M1-M6 models. (2) Comparing the OHFS-IDAMN (M12) model with the M7-M11 models (domain adaptation-based models), the diagnosis accuracies of tasks 1 and 2 are noticeably higher than those of M7-M11. The accuracy of the M12 model in task 1 attains 86.17%, which is, respectively, 10.17%, 3.67%, 20.00%, 6.00%, and 11.67% higher than that of the M7-M11 models. Accordingly, in terms of domain adaptation ability, the proposed IDAMN evidently outperforms traditional JDA, BDA, TCA, JGSA, and GFK, which effectively increases the CFD accuracy. (3) Comparing M7-M12 (domain adaptation-based models without DFCD) with M13-M18 (domain adaptation-based models with DFCD), it is easily found that DFCD remarkably improves the diagnosis accuracy of domain adaptation-based models. Taking OHFS-JDA (M8) and OHFS-DFCD-JDA (M14) as examples, the accuracies of tasks 1 and 2 of the M14 model are 89.00% and 83.83%, respectively, whereas the M8 model only attains 82.50% and 72.00%, which is obviously inferior to the M14 model. Accordingly, the above experimental analysis once again shows that DFCD can help refine features with strong domain adaptability, which effectively enhances domain adaptation performance and increases CFD accuracy. To sum up, extensive experiments are carried out, and the results further validate the validity, adaptability, and superiority of the DFCD-IDAMN framework under diverse working speeds.

5. Conclusions

This work designs a new framework based on the proposed DFCD and IDAMN for rolling bearing fault diagnosis across diverse operating conditions. In this framework, EEMD is first applied for signal processing and statistics-based feature extraction. Then, DFCD is employed to refine the features by evaluating their fault distinguishability and WCI. Next, IDAMN maps the feature data into a GM subspace and further achieves improved JDA with neighborhood relationship preservation. Finally, an adaptive classifier is trained for fault diagnosis.

By utilizing bearing data collected from two experimental platforms, extensive fault diagnosis experiments are conducted. The experimental results show the following: (1) DFCD can effectively refine features with better domain adaptability; accordingly, its utilization significantly enhances the diagnosis accuracy of domain adaptation-based models. (2) IDAMN possesses more robust domain adaptation ability than JDA, TCA, BDA, JGSA, and GFK. (3) The model built from DFCD and IDAMN attains desirable cross-domain fault diagnosis accuracy with a suitable nf, which demonstrates promising capability for practical industrial scenarios with variable working conditions. In the future, we plan to develop stronger domain adaptation-based approaches for more complicated fault detection scenes and to conduct research on adaptive optimization of the parameters used in the proposed methods.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest regarding the publication of this paper.

Acknowledgments

This work was supported in part by the Innovation and Entrepreneurship Training Program for College Students of China under Grant no. 202210357121 and the Joint Funds of the Zhejiang Provincial Natural Science Foundation of China under Grant no. LTY22E050001.