Abstract

Performance evaluation of fault diagnosis algorithms is an indispensable step in the development and acceptance of a fault diagnosis system. To evaluate the stability of fault diagnosis models based on feature clustering, a method based on image edge detection and the Elliptic Fourier Descriptor (EFDSE) is proposed, which applies image similarity measurement to the evaluation of fault diagnosis algorithms. The resulting quantitative index of the diagnostic capability of feature-clustering-based fault diagnosis models provides a reference for the acceptance and reliability of the diagnosis results. Finally, the effectiveness of the stability evaluation is verified on fault data from motor bearings.

1. Introduction

With the development of modern industrial and information technology, manufacturing systems in fields such as new energy, communication, computing, and industry are becoming increasingly complex. Because of this structural complexity and the influence of various latent factors, hidden faults inevitably exist in such systems. Once a hidden fault is triggered, personnel and economic losses of varying degrees follow. Fault diagnosis methods have therefore become a focus of research. There are three common classes of fault diagnosis methods: methods based on control models [1], methods based on statistics [2], and methods based on artificial intelligence [3]. At present, a large number of studies focus on optimizing the stability of the fault diagnosis model, which is usually measured by the degree of diagnosis; however, the higher the stability required of the model in practical application, the higher the cost. It is therefore necessary to analyse the effect of model stability on the effectiveness of fault diagnosis. A unified system for measuring the accuracy of models is still lacking. The main methods are error-analysis methods such as relative deviation and the residual sum of squares [4], together with grey correlation theory [5], ED train based on statistical data, and confidence intervals [6]. Grey correlation theory can handle diagnosis with multiple data inputs, but it applies only within the range of the same characteristic parameters and cannot be used to compare fault diagnosis models whose parameters have different feature ranges. The residual sum of squares is an evaluation method for regression models. Because it is influenced by the absolute values of the dependent and independent variables, it is not suited to relative comparison between different fault diagnosis models. The confidence interval index rests on the hypothesis that the diagnosis results of the training data group follow a normal distribution. It produces large errors when data are scarce, and the upper limit of the confidence interval does not converge to 1 as the accuracy increases. It is therefore unsuitable for fault diagnosis models with high accuracy.

Building on the above research and considering the true distribution of the fault diagnosis output, this paper proposes a stability evaluation method for fault diagnosis models based on the Elliptic Fourier Descriptor. The method applies similarity measurement to the evaluation of fault diagnosis algorithms and, when a fault diagnosis model based on feature clustering is used to diagnose faults, provides an objective evaluation without requiring knowledge of the parameters of the model. An application example of motor bearing diagnosis is used for comparison and verification, which shows that the proposed method is effective.

2. System Description and Model

To train and verify new techniques and theories for motor bearing faults, a motor bearing state evaluation system developed by Rockwell was used to obtain a series of motor performance databases [7], which can be used to verify or improve motor performance evaluation. Projects that have used or are using these databases include Winsnode state assessment technology, model-based diagnosis technology, and a motor speed determination algorithm. The experimental platform is shown in Figure 1.

As shown in Figure 1, the test stand consists of a 2 hp motor (left), a torque transducer/encoder (center), a dynamometer (right), and control electronics (not shown). The test bearings support the motor shaft. Single point faults were introduced to the test bearings using electro-discharge machining with fault diameters of 7 mils, 14 mils, 21 mils, 28 mils, and 40 mils (1 mil = 0.001 inches). See FAULT SPECIFICATIONS for fault depths. SKF bearings were used for the 7, 14, and 21 mil diameter faults, and NTN equivalent bearings were used for the 28 mil and 40 mil faults. Drive end and fan end bearing specifications, including bearing geometry and defect frequencies, are listed in the BEARING SPECIFICATIONS.

Vibration data was collected using accelerometers, which were attached to the housing with magnetic bases. Accelerometers were placed at the 12 o’clock position at both the drive end and the fan end of the motor housing. During some experiments, an accelerometer was attached to the motor supporting base plate as well. Vibration signals were collected using a 16 channel DAT recorder and were post processed in a Matlab environment. All data files are in Matlab (.mat) format. Digital data was collected at 12,000 samples per second, and data was also collected at 48,000 samples per second for drive end bearing faults. Speed and horsepower data were collected using the torque transducer/encoder and were recorded by hand.

Outer raceway faults are stationary faults; therefore placement of the fault relative to the load zone of the bearing has a direct impact on the vibration response of the motor/bearing system. In order to quantify this effect, experiments were conducted for both fan and drive end bearings with outer raceway faults located at 3 o’clock (directly in the load zone), at 6 o’clock (orthogonal to the load zone), and at 12 o’clock.

Data files are in Matlab format. Each file contains fan end and drive end vibration data as well as motor rotational speed. For all files, the following items in the variable names indicate:
DE: drive end accelerometer data
FE: fan end accelerometer data
BA: base accelerometer data
time: time series data
RPM: rpm during testing

Part of the drive end bearing fault data is shown in Table 1.

3. Fault Diagnosis Model Based on Feature Clustering

Clustering analysis is a kind of unsupervised learning. It does not require classes to be defined in advance or training samples that indicate which label the data should have. A data set is divided into a number of different classes such that the data within each class are highly similar. This makes clustering well suited to settings where no standard labels can be identified, such as fault diagnosis. Because some system parameters, environmental interference, and noise are difficult to determine accurately in a real environment, it is difficult to establish an accurate fault diagnosis model. A data-driven method avoids mathematical modeling of the process: when the mechanism of the diagnosed object is not clear, it learns from historical data and builds a model to complete the fault diagnosis. Commonly used clustering algorithms include Kmeans [8], BIRCH [9], EM, DBSCAN [10], and CLARANS [11]. This paper mainly studies the stability evaluation method for fault diagnosis models (EFDSE). Therefore, Clara, Kmeans, and Dbscan are used directly for feature-clustering-based fault diagnosis of the motor drive end (DE) bearing. Clara and Kmeans are distance-based clustering algorithms, and Dbscan is a density-based clustering algorithm. The data DE is a time series, as shown in Figure 2.

In fault diagnosis of motor bearings, the quality of fault feature extraction determines the final diagnosis rate. The peak-to-average ratio (PAR), kurtosis (KURTOSIS), and skewness (SKEWNESS) of the vibration data cover the distribution features, statistical characteristics, and linear characteristics of the vibration, and can effectively reflect the main characteristics of vibration events. This paper therefore takes these three features as the basis of fault diagnosis; they are calculated by (1), (2), and (3), respectively.

The quantities used in these expressions are the peak power, the instantaneous amplitude, the mean amplitude, the probability density, and the standard deviation.
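As an illustration of the three features, the following is a minimal sketch under stated assumptions, not the authors' implementation. It assumes PAR is the ratio of peak power to mean power and that kurtosis and skewness are the usual standardized fourth and third moments, with sample averages standing in for integrals over the probability density. The function name `vibration_features` is hypothetical.

```python
import numpy as np

def vibration_features(signal):
    """Return (PAR, kurtosis, skewness) of a 1D vibration signal.

    Assumed definitions (not taken from the paper's equations):
      PAR      = peak power / mean power
      kurtosis = E[(x - mean)^4] / std^4
      skewness = E[(x - mean)^3] / std^3
    """
    x = np.asarray(signal, dtype=float)
    mean = x.mean()
    std = x.std()                      # population standard deviation
    power = x ** 2                     # instantaneous power
    par = power.max() / power.mean()
    kurtosis = np.mean((x - mean) ** 4) / std ** 4
    skewness = np.mean((x - mean) ** 3) / std ** 3
    return par, kurtosis, skewness
```

Applied to each segment of the DE time series, this yields one three-component feature vector per segment.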

Because some feature vectors of the vibration may be correlated, the stability of clustering-based fault diagnosis models can be affected. Removing correlated feature vectors is therefore the first step in fault diagnosis. In this paper, principal component analysis (PCA) is used to extract uncorrelated feature vectors. The calculation results are shown in Figure 3. It can be seen that PAR and KURTOSIS can represent the feature vectors of DE, as shown in Table 2.
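The PCA step can be sketched as follows. This is a generic illustration that assumes the features are standardized and that PCA is carried out by eigendecomposition of their correlation matrix; the paper does not state which variant was used, and the function name `pca_explained_variance` is hypothetical.

```python
import numpy as np

def pca_explained_variance(features):
    """PCA on standardized features via the correlation matrix.

    features: array of shape (n_samples, n_features).
    Returns (ratios, components): explained-variance ratios in
    descending order and the matching eigenvectors as columns.
    """
    x = np.asarray(features, dtype=float)
    z = (x - x.mean(axis=0)) / x.std(axis=0)    # standardize each feature
    corr = (z.T @ z) / len(z)                   # correlation matrix
    eigvals, eigvecs = np.linalg.eigh(corr)     # symmetric eigendecomposition
    order = np.argsort(eigvals)[::-1]           # largest eigenvalue first
    eigvals = eigvals[order]
    eigvecs = eigvecs[:, order]
    return eigvals / eigvals.sum(), eigvecs
```

A component whose explained-variance ratio is close to zero signals that one feature is nearly a linear combination of the others and can be dropped.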

To verify the reliability of the proposed evaluation method, this paper adopts more than ten clustering evaluation indexes, including the Davies-Bouldin index (DB) [12], the Dunn Validity Index (DVI) [13], and the Silhouette coefficient (SC) [14], to pre-evaluate the optimal number of clusters for the DE feature vectors. As a result, motor bearing faults are classified into two categories. That is, Clara, Kmeans, and Dbscan are each used to divide the faults into two categories by means of Euclidean distance. To verify the effectiveness of the proposed method, the training set T is randomly extracted from the feature set F according to the proportion of 3:1, and then the training set T and the feature set F are each divided into two categories. The classification results are shown in Figure 4.
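The sampling and two-class clustering step can be sketched as below. Several points are assumptions made for illustration: the 3:1 proportion is read as drawing three quarters of F at random, only Kmeans is shown, and the centers are initialized deterministically (first point and the point farthest from it) rather than at random. The function names are hypothetical.

```python
import numpy as np

def sample_training_set(features, fraction=0.75, seed=0):
    """Randomly draw a fraction of the rows of F as the training set T."""
    features = np.asarray(features, dtype=float)
    rng = np.random.default_rng(seed)
    size = int(round(fraction * len(features)))
    idx = rng.choice(len(features), size=size, replace=False)
    return features[np.sort(idx)]

def kmeans_two_clusters(points, n_iter=100):
    """Lloyd's algorithm with k = 2 and Euclidean distance."""
    points = np.asarray(points, dtype=float)
    first = points[0]
    second = points[np.argmax(np.linalg.norm(points - first, axis=1))]
    centers = np.stack([first, second])
    for _ in range(n_iter):
        # Distance from every point to both centers, then nearest center.
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = np.argmin(dists, axis=1)
        new_centers = centers.copy()
        for c in range(2):
            members = points[labels == c]
            if len(members) > 0:            # keep the old center if empty
                new_centers[c] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers
```

Running `kmeans_two_clusters` once on F and once on `sample_training_set(F)` gives the two partitions whose first classes are compared in the following steps.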

The first-class data corresponding to the training set T and the feature set F are extracted, respectively (see Figure 5).

To evaluate the stability of the fault diagnosis model based on feature clustering, the 2D points need to be mapped to a graphic according to a certain rule. This paper uses Dirichlet tessellation for this mapping. The Dirichlet tessellation, also known as the Voronoi diagram or Thiessen polygons, is the dual of the Delaunay triangulation; it is a structure of computational geometry that can be used for qualitative analysis, statistical analysis, and adjacency analysis [15]. In this paper, the Euclidean distance between any two points in the first category is computed. Each point, taken as the vertex of a triangle, is connected to its two nearest points by Euclidean distance, and the Delaunay triangulation is obtained after N iterations. All triangles sharing a common point are recorded, the circumcenter of each triangle is found, and connecting these circumcenters in clockwise order gives the corresponding Thiessen polygon. The time complexity of constructing the polygons is that of constructing the triangulation. The algorithm flow is as follows.

Assumption: let $P = \{p_1, p_2, \ldots, p_N\}$ denote a set of N distinct points in the plane. The Delaunay triangulation of the point set is constructed as follows.

Step 1. Sort the points primarily by x coordinate and secondarily by y coordinate.

Step 2. Construction process:
(i) If the point set has too few points to form a triangle, return
(ii) If the point set has three points, connect the three points to construct a triangulation and return
(iii) Divide the points into a left subset and a right subset according to the even-split principle or the nearest-neighbor principle
(iv) Construct the triangulation of the left subset
(v) Construct the triangulation of the right subset
(vi) Merge the two triangulations and return the result

Step 3. Merge process:
(i) For the given left and right triangulations, compute the convex hull of each
(ii) Obtain the top tangent and the bottom tangent of the two convex hulls
(iii) Starting from the bottom tangent, use its left endpoint, right endpoint, and their adjacent points to complete the merged triangulation, continuing until the top tangent is encountered.
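The relationship between the Delaunay triangulation and the Voronoi vertices can be illustrated with a short sketch. Note that it deliberately swaps in a brute-force empty-circumcircle test in place of the divide-and-conquer construction described in the steps; it is only practical for small point sets and assumes no four points lie on a common circle. All function names are hypothetical.

```python
from itertools import combinations

def circumcenter(a, b, c):
    """Circumcenter of triangle abc, or None if the points are collinear."""
    ax, ay = a
    bx, by = b
    cx, cy = c
    d = 2.0 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    if abs(d) < 1e-12:
        return None
    a2, b2, c2 = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    ux = (a2 * (by - cy) + b2 * (cy - ay) + c2 * (ay - by)) / d
    uy = (a2 * (cx - bx) + b2 * (ax - cx) + c2 * (bx - ax)) / d
    return (ux, uy)

def _dist2(p, q):
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

def brute_force_delaunay(points):
    """Index triples whose circumcircle contains no other point."""
    triangles = []
    for i, j, k in combinations(range(len(points)), 3):
        center = circumcenter(points[i], points[j], points[k])
        if center is None:
            continue
        r2 = _dist2(center, points[i])
        others = (n for n in range(len(points)) if n not in (i, j, k))
        if all(_dist2(center, points[n]) >= r2 - 1e-9 for n in others):
            triangles.append((i, j, k))
    return triangles

def voronoi_vertices(points):
    """Voronoi vertices are the circumcenters of the Delaunay triangles."""
    return [circumcenter(points[i], points[j], points[k])
            for i, j, k in brute_force_delaunay(points)]
```

For real data sets, a library routine with better time complexity would normally replace the brute-force search.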

The Voronoi diagrams of class I of the T set and the F set are shown in Figure 6.

4. EFDSE Algorithm

4.1. Edge Detection

The Canny edge detection operator has clear advantages over the Roberts Cross, Prewitt, Sobel, and Kirsch operators. In this paper, edge detection is therefore performed with the Canny operator to identify contour boundaries.

First, to smooth the image and reduce the influence of noise on the edge detector, the image is convolved with a Gaussian filter kernel $H$ of size $(2k+1) \times (2k+1)$:

$$H_{ij} = \frac{1}{2\pi\sigma^2} \exp\left(-\frac{(i-(k+1))^2 + (j-(k+1))^2}{2\sigma^2}\right), \quad 1 \le i, j \le 2k+1.$$

If $A$ is a window of the image of the same size as $H$, centered on the pixel $e$ to be filtered, then after Gaussian filtering the brightness value of the pixel is

$$e = H * A = \mathrm{sum}(H \odot A),$$

where $*$ is the convolution symbol, $\odot$ is the element-wise product, and $\mathrm{sum}$ denotes the sum of all elements of the matrix.

The Canny algorithm uses four operators to detect the horizontal, vertical, and diagonal edges of the image. The edge detection operator returns the first-order derivative values $G_x$ and $G_y$ in the horizontal and vertical directions, from which the gradient intensity $G = \sqrt{G_x^2 + G_y^2}$ and the gradient direction $\theta = \arctan(G_y / G_x)$ of each pixel are determined. The gradient intensity of the current pixel is compared with that of the two pixels along the positive and negative gradient directions. If the gradient intensity of the current pixel is the largest of the three, the pixel is retained as an edge point; otherwise the pixel is suppressed. This step is called non-maximum suppression.

After non-maximum suppression, some edge pixels caused by noise and color variation remain. To remove these spurious responses, a high and a low threshold are selected. If the gradient value of an edge pixel is higher than the high threshold, it is marked as a strong edge pixel; if it is lower than the high threshold but higher than the low threshold, it is marked as a weak edge pixel; if it is lower than the low threshold, it is suppressed. Strong edge pixels are retained as edge points. A weak edge pixel is retained as an edge point only if at least one of the pixels in its 8-neighborhood is a strong edge pixel. The Voronoi diagrams in Figure 6 are processed with the Canny operator, and their edge contours are detected (see Figure 7).
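Two of the Canny stages described above, Gaussian kernel construction and double thresholding with the 8-neighborhood rule, can be sketched as follows. This is an illustrative sketch only: the kernel is normalized to sum to 1, the neighborhood check is a single pass rather than full edge tracking, and the function names `gaussian_kernel` and `hysteresis_threshold` are hypothetical.

```python
import numpy as np

def gaussian_kernel(k, sigma):
    """(2k+1) x (2k+1) Gaussian kernel, normalized so its elements sum to 1."""
    idx = np.arange(1, 2 * k + 2)
    di = idx[:, None] - (k + 1)          # row offset from the kernel center
    dj = idx[None, :] - (k + 1)          # column offset from the kernel center
    h = np.exp(-(di ** 2 + dj ** 2) / (2.0 * sigma ** 2)) / (2.0 * np.pi * sigma ** 2)
    return h / h.sum()

def hysteresis_threshold(grad, low, high):
    """Keep strong pixels, and weak pixels with a strong 8-neighbor."""
    grad = np.asarray(grad, dtype=float)
    strong = grad > high
    weak = (grad > low) & ~strong
    rows, cols = grad.shape
    # Pad with False so border pixels can be handled uniformly.
    padded = np.pad(strong, 1, mode="constant", constant_values=False)
    has_strong_neighbor = np.zeros_like(strong)
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue
            has_strong_neighbor |= padded[1 + dr:1 + dr + rows,
                                          1 + dc:1 + dc + cols]
    return strong | (weak & has_strong_neighbor)
```

Pixels at or below the low threshold never appear in the output mask, which corresponds to the suppression step.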

4.2. Fourier Descriptor

Shape is one of the most important visual features of a target. The existing shape representation methods can be divided into two categories: shape representation based on region feature and shape representation based on contour feature. The contour based method mainly uses the pixel information of the target coverage area boundary to describe the shape [16, 17].

The Fourier Descriptor is a classical contour-based shape representation method, first proposed by Cosgriff (1960). The main idea is to describe the features of the contour by a set of data that represents the overall frequency content of the shape. It is invariant under rotation and translation and is the most widely used shape descriptor. On the algorithmic side, researchers have done a great deal of work on improving Fourier Descriptor based shape representation in order to increase its descriptive power. D Zhang and G Lu propose an enhanced generic Fourier descriptor to extract the key part of the image content description. This method addresses the shortcoming that many shape descriptors are not suitable for generic shape description [18]. SS Li, YD Huang, and JW Yang propose a region-based affine invariant ring Fourier Descriptor for affine invariant feature extraction, which can be used to extract the contour features of objects with multiple components [19]. R Kasaudhan and SH Son propose an enhanced version of the grid distance Fourier Descriptor to calculate image similarity and improve the image matching rate. B Belkhaoui, A Toumi, and A Khalfallah combine the Fourier Descriptor with the watershed (WS) algorithm to propose a process and method for automatic target recognition using inverse synthetic aperture radar images, which addresses the target recognition problem for radar images [20].

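First, we define a continuous curve $c(t)$ in order to explain the Fourier expansion (see Figure 8). $c(t)$ can be expressed by

$$c(t) = \sum_{k=-\infty}^{\infty} c_k e^{jk\omega t},$$

where $\omega = 2\pi/T$ and $T$ is the period of $c(t)$.

According to Euler's formula,

$$e^{jk\omega t} = \cos(k\omega t) + j\sin(k\omega t),$$

so that

$$c(t) = c_0 + \sum_{k=1}^{\infty}\left[(c_k + c_{-k})\cos(k\omega t) + j(c_k - c_{-k})\sin(k\omega t)\right].$$

If we define

$$a_k = c_k + c_{-k}, \qquad b_k = j(c_k - c_{-k}),$$

then the expansion becomes

$$c(t) = \frac{a_0}{2} + \sum_{k=1}^{\infty}\left(a_k\cos(k\omega t) + b_k\sin(k\omega t)\right),$$

where $a_k$ and $b_k$ are called the Fourier descriptors.

Conversely, $c_k$ and $c_{-k}$ can be derived from these definitions:

$$c_k = \frac{a_k - jb_k}{2}, \qquad c_{-k} = \frac{a_k + jb_k}{2}.$$

The coefficients in the trigonometric expansion can be obtained by considering the orthogonality property. Thus, one way to compute values for the descriptors is

$$a_k = \frac{2}{T}\int_0^T c(t)\cos(k\omega t)\,dt, \qquad b_k = \frac{2}{T}\int_0^T c(t)\sin(k\omega t)\,dt.$$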

4.3. Fourier Description of the Edge Features of Fault Classification

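Starting from a chosen point on the target boundary and moving counterclockwise along it at a constant speed, the coordinates of the boundary points can be used to describe the boundary. The cluster boundary curve of the first-class data set is defined as

$$c(t) = x(t) + j\,y(t),$$

where $t$ is the arc length along the boundary. To describe the outline of the image, the point travels from the selected starting point once around the boundary curve, so $c(t)$ is a periodic function with period $T$. To obtain the Elliptic Fourier Descriptor of the boundary curve, Fourier series expansion is carried out first; $x(t)$ and $y(t)$ are each expanded as a 1D Fourier series with $\omega = 2\pi/T$:

$$x(t) = \frac{a_{x0}}{2} + \sum_{k=1}^{\infty}\left(a_{xk}\cos(k\omega t) + b_{xk}\sin(k\omega t)\right),$$

$$y(t) = \frac{a_{y0}}{2} + \sum_{k=1}^{\infty}\left(a_{yk}\cos(k\omega t) + b_{yk}\sin(k\omega t)\right).$$

The elliptic coefficients are then computed by

$$a_{xk} = \frac{2}{T}\int_0^T x(t)\cos(k\omega t)\,dt, \qquad b_{xk} = \frac{2}{T}\int_0^T x(t)\sin(k\omega t)\,dt,$$

$$a_{yk} = \frac{2}{T}\int_0^T y(t)\cos(k\omega t)\,dt, \qquad b_{yk} = \frac{2}{T}\int_0^T y(t)\sin(k\omega t)\,dt,$$

which in discrete form become

$$a_{xk} = \frac{2}{m}\sum_{i=1}^{m} x_i\cos(k\omega i\tau), \qquad b_{xk} = \frac{2}{m}\sum_{i=1}^{m} x_i\sin(k\omega i\tau),$$

$$a_{yk} = \frac{2}{m}\sum_{i=1}^{m} y_i\cos(k\omega i\tau), \qquad b_{yk} = \frac{2}{m}\sum_{i=1}^{m} y_i\sin(k\omega i\tau).$$

Here $m$ is the number of sampling points on the contour curve, usually half the number of pixels on the contour curve, $\tau = T/m$ is the sampling interval, and $x_i$ and $y_i$ are the values of $x(t)$ and $y(t)$ at the sample point $t = i\tau$.

According to the relationship between trigonometric and exponential functions,

$$c_k = \frac{a_{xk} - jb_{xk}}{2} + j\,\frac{a_{yk} - jb_{yk}}{2}, \qquad c_{-k} = \frac{a_{xk} + jb_{xk}}{2} + j\,\frac{a_{yk} + jb_{yk}}{2}.$$

Accordingly, $c_k$ and $c_{-k}$ can be regarded as sums of complex numbers. That is,

$$c_k = A_k - jB_k, \qquad c_{-k} = A_k + jB_k.$$

Here

$$A_k = \frac{a_{xk} + ja_{yk}}{2}, \qquad B_k = \frac{b_{xk} + jb_{yk}}{2}.$$

The expansion of $c(t)$ can then be expressed as

$$c(t) = \frac{a_{x0}}{2} + \sum_{k=1}^{\infty}\left(a_{xk}\cos(k\omega t) + b_{xk}\sin(k\omega t)\right) + j\left(\frac{a_{y0}}{2} + \sum_{k=1}^{\infty}\left(a_{yk}\cos(k\omega t) + b_{yk}\sin(k\omega t)\right)\right).$$

Written in matrix form, this is called the Elliptic Fourier Description:

$$\begin{bmatrix} x(t) \\ y(t) \end{bmatrix} = \frac{1}{2}\begin{bmatrix} a_{x0} \\ a_{y0} \end{bmatrix} + \sum_{k=1}^{\infty}\begin{bmatrix} a_{xk} & b_{xk} \\ a_{yk} & b_{yk} \end{bmatrix}\begin{bmatrix} \cos(k\omega t) \\ \sin(k\omega t) \end{bmatrix}.$$

According to this description, the Elliptic Fourier Descriptor of the fault classification contour curve is obtained and normalized, as shown in Table 3.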
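The discrete computation of the elliptic coefficients can be sketched as below. The sketch assumes the contour has already been sampled at m points that are evenly spaced in the curve parameter, so that the angle for harmonic k at sample i reduces to 2πki/m. Normalization of the descriptor is not shown because the paper does not specify it. The function name `elliptic_fourier_coefficients` is hypothetical.

```python
import numpy as np

def elliptic_fourier_coefficients(x, y, n_harmonics):
    """Discrete elliptic Fourier coefficients of a closed sampled contour.

    x, y: coordinates of m evenly spaced contour samples (i = 1..m).
    Returns an array of shape (n_harmonics, 4) whose row k-1 holds
    (a_xk, b_xk, a_yk, b_yk).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    m = len(x)
    i = np.arange(1, m + 1)
    coeffs = np.zeros((n_harmonics, 4))
    for k in range(1, n_harmonics + 1):
        angle = 2.0 * np.pi * k * i / m      # k * omega * i * tau
        cos_k, sin_k = np.cos(angle), np.sin(angle)
        coeffs[k - 1] = (2.0 / m) * np.array([
            np.sum(x * cos_k), np.sum(x * sin_k),
            np.sum(y * cos_k), np.sum(y * sin_k),
        ])
    return coeffs
```

For an ellipse with semi-axes 3 and 2 sampled uniformly in its parameter, the first harmonic recovers (3, 0, 0, 2) and higher harmonics vanish, which is why each harmonic can be read as one ellipse.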

4.4. Stability Evaluation of Fault Diagnosis Model Based on Elliptical Fourier Descriptor

Let $D_T$ and $D_F$ denote the class I contour edge descriptors of the training set T and the feature set F, respectively. If the stability of a fault diagnosis model based on feature clustering is good, the cohesion around the class center is strong. That is, when data with the same characteristics are added to or removed from a class, the boundary shape of the cluster changes very little, and vice versa. We therefore define the similarity of the contour shapes of the two fault classification results as the stability evaluation criterion of the fault diagnosis model:

$$S = \frac{\mathrm{cov}(D_T, D_F)}{\sigma_{D_T}\,\sigma_{D_F}},$$

where $\mathrm{cov}(D_T, D_F)$ represents the covariance of the two descriptors and $\sigma_{D_T}$ and $\sigma_{D_F}$ represent the standard deviations of the descriptor vectors. The range of $S$ is $[-1, 1]$. The closer the value of $S$ is to 1, the better the stability of the fault diagnosis model.
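The similarity index can be sketched in a few lines. The sketch reads the index as the covariance of the two descriptor vectors divided by the product of their standard deviations, that is, the Pearson correlation coefficient, and assumes both descriptors have been flattened to vectors of equal length. The function name `efd_similarity` is hypothetical.

```python
import numpy as np

def efd_similarity(desc_t, desc_f):
    """Correlation between two descriptor vectors of equal length."""
    d_t = np.asarray(desc_t, dtype=float).ravel()
    d_f = np.asarray(desc_f, dtype=float).ravel()
    cov = np.mean((d_t - d_t.mean()) * (d_f - d_f.mean()))
    return cov / (d_t.std() * d_f.std())
```

Identical or positively rescaled descriptors give a value of 1, so values near 1 indicate that sampling T from F barely changed the contour shape.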

5. Experimental Results and Discussion

Clustering results are usually verified by two kinds of measures. One is the intracluster distance, such as within.cluster.ss, which is the sum of squared distances within each cluster: the smaller the value, the more similar the characteristics of the data in a cluster and the better the clustering effect. The other is the distance between clusters, such as the average silhouette coefficient calculated by avg.silwidth: the larger the value, the larger the difference between the data features of different classes and the better the discrimination of the clustering algorithm. The within.cluster.ss and avg.silwidth indexes of the three clustering results are compared with the EFDSE index proposed in this paper (shown in Table 4). As seen from the table, EFDSE indicates that the diagnosis effect of Kmeans is the best, Clara is second, and Dbscan is the worst. This is consistent with the conclusions drawn from avg.silwidth and within.cluster.ss, which shows that the proposed EFDSE method is effective for the stability evaluation of fault diagnosis models based on feature clustering.
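The two reference indexes can be sketched as follows. The sketch interprets within.cluster.ss as the sum of squared Euclidean distances of points to their cluster centroid and avg.silwidth as the mean silhouette width; these readings are assumptions, and the values may differ in detail from those reported by the software used in the paper. At least two clusters are assumed, and the function names are hypothetical.

```python
import numpy as np

def within_cluster_ss(points, labels):
    """Sum of squared distances of points to their cluster centroid."""
    points = np.asarray(points, dtype=float)
    labels = np.asarray(labels)
    total = 0.0
    for c in np.unique(labels):
        members = points[labels == c]
        total += np.sum((members - members.mean(axis=0)) ** 2)
    return float(total)

def average_silhouette_width(points, labels):
    """Mean silhouette width over all points (requires >= 2 clusters)."""
    points = np.asarray(points, dtype=float)
    labels = np.asarray(labels)
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    clusters = np.unique(labels)
    widths = []
    for i in range(len(points)):
        own = labels == labels[i]
        n_own = own.sum()
        if n_own == 1:                       # singleton cluster: width 0
            widths.append(0.0)
            continue
        a = dist[i, own].sum() / (n_own - 1)  # mean distance within own cluster
        b = min(dist[i, labels == c].mean()   # nearest other cluster
                for c in clusters if c != labels[i])
        widths.append((b - a) / max(a, b))
    return float(np.mean(widths))
```

A small within-cluster sum of squares together with a silhouette width close to 1 indicates compact, well-separated clusters.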

6. Conclusion

EFDSE maps the fault classification results to a 2D graphic and applies graphic edge detection. By extracting the feature vectors of the contour curves of the fault classification results, the contour shape similarity is calculated to evaluate the effect of fault diagnosis. It is a new stability evaluation method for fault diagnosis models based on feature clustering, and it applies image similarity measurement to the evaluation of fault diagnosis algorithms. When the data samples and data methods are unknown, the stability of the diagnostic effect of the model is evaluated using only the visual contour feature vectors of the fault classification results. Judging from the experimental method and principle, the evaluation is applicable to fault diagnosis models based on feature clustering, provided that the clustering algorithm used is distance-based or density-based. Extending EFDSE to a wider range of fault diagnosis methods is the direction of our future work.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

The authors thank all challenge participants for their invaluable contribution. This paper is supported by the National Key Research and Development Plan (2016YFC0701309) and by the National Natural Science Foundation of China (61627816).