Abstract

To address the difficulty of recognizing the quiet period of acoustic emission (AE) in rocks, four machine learning algorithms were used to develop and improve a recognition method for the AE quiet period. During model building, the AE time domain data were cleaned of abnormal values with the box plot method and normalized to remove differences in dimension. The frequency domain data were denoised with a four-level wavelet transform and decomposed into three-level wavelet packets, and a group of 8 wavelet packet energy parameters was established as the frequency domain characteristic parameters. Based on the AE time domain data, frequency domain data, and composite (time-frequency domain) data sets, grid search was used to obtain the optimal parameters of the four machine learning models. Accuracy, precision, recall, and F1 score were used to verify and evaluate the recognition performance of the models. The results show that the models recognize the quiet period well: the accuracy of the models using the frequency domain data set is the lowest, and the accuracy of the models using the composite data set is the highest, at about 90%. In terms of accuracy, the kernel support vector machine model performs best, with an average accuracy of 0.87. Considering precision, recall, and F1 score together, the random forest (RF) model is the best model for recognizing the quiet period of acoustic emission.

1. Introduction

Acoustic emission (AE) of rock is the physical phenomenon in which strain energy is released in the form of stress waves during the deformation and failure of rock. The quiet period of acoustic emission refers to an interval, especially before the peak stress, in which the acoustic emission of rock materials is rare or difficult to observe [1], and it can be used as a precursor of rock failure and instability. Researchers have therefore used compression, tension, shear, and other tests to study the characteristics of the acoustic emission quiet period of different rocks and to seek their common characteristics. Wang et al. [2] divided the evolution of acoustic emission signals into five stages by uniaxial compression experiments on layered cemented tailings backfill: quiet period, slow rising period, rapid rising period, rapid descending period, and slow descending period. The quiet period was characterized by only a small number of acoustic emission signals being distributed in the specimens. Wang et al. [3] carried out triaxial compression tests on granite samples under real-time AE monitoring and found that the AE signals of granite samples under triaxial compression can be divided into four stages: quiet period 1, active stage 1, quiet period 2, and active stage 2. Li et al. [4] found that granite also enters a relatively quiet period of acoustic emission before the peak stress. Zhang et al. [5] found that saturated granite enters the quiet period of acoustic emission later. Song et al. [6] observed a quiet period of acoustic emission in red sandstone in terms of acoustic emission ringing count and energy, and Qiang et al. [7] found a quasi-quiet period in sandstone. Wang et al. [8] carried out an experimental study on the failure of limestone under uniaxial compression, monitored the acoustic emission characteristics of the whole failure process, and found that the relative quiet period can be used as precursor information of rock failure. Xu et al. [9] found that the quiet period of acoustic emission becomes more obvious as the confining pressure increases. Shu-lin and Hai-yan [10] analyzed the quiet period of acoustic emission of a large number of rocks and found no quiet period of acoustic emission in rocks from a copper mine and a tungsten-molybdenum deposit [11]. However, the rock samples with a sudden increase in acoustic emission at medium pressure show a relative quiet period of acoustic emission before the peak stress. The above studies show that the AE quiet period is a common phenomenon, but it is difficult to identify accurately and in a timely manner.

Machine learning is an interdisciplinary field based on statistics, mathematics, and computer science, and various machine learning models are gradually being applied in rock engineering. Deng et al. [12] used support vector machine (SVM), random forest (RF), and XGBoost (extreme gradient boosting) algorithms to automatically classify slope behaviour according to the standard landslide velocity scale. Zhao and Glaser [13] verified that machine learning algorithms including artificial neural network (ANN) and support vector machine (SVM) have higher accuracy and repeatability in relocating the source in complex media where the velocity model is unknown. Meylan et al. [14] put forward a method for real-time monitoring of gold ore weakened by electric pulses by combining acoustic emission with a machine learning algorithm. Pu et al. [15] investigated the performance of 10 commonly used machine learning models for microseismic/blasting event identification. Their fuzzy evaluation model comprehensively considers model performance and provides geological engineers with reliable a priori knowledge for model selection. Xinjin [16] used a Gaussian classification machine learning algorithm to analyze the acoustic emission and sound signals produced by rockburst, so as to identify the role of rockburst. Zhili and Xu [17] established several data sets to analyze, evaluate, and improve the prediction of rockburst types by introducing 9 classical machine learning algorithms. Zhaofu [18] used variational mode decomposition and sample entropy to decompose rock fracture signals and identify whether the rock failure type is split failure or tension-shear failure. He [19] designed and developed an acoustic emission detection system based on wavelet decomposition of rock fracture signals.

The occurrence of acoustic emission in engineering does not mean that a disaster must occur, but disasters are always accompanied by acoustic emission, and the loss of any information may lead to inaccurate prediction. At present, research on the quiet period of acoustic emission focuses mainly on a single category of parameters, in either the time domain or the frequency domain. Because a single data set lacks information, the resulting recognition of the acoustic emission quiet period is not accurate enough. Therefore, in this paper, composite (time-frequency domain) data sets were used, sandstone acoustic emission quiet period recognition models were established based on four machine learning algorithms, and accuracy, precision, recall, and F1 score were used to evaluate the generalization ability of the four machine learning models.

2. Acoustic Emission Characteristics of Rock Failure Process

2.1. Introduction to the Experiment

The experimental rock samples were light yellow, fine-grained sandstone with a block structure, and the samples were compact and crack-free. The specification was (). The samples were divided into two groups (A and B), with 5 in each group. Group A was composed of intact rock samples, and each rock sample in group B had a prefabricated crack, 20 mm deep and 3 mm high, in its middle. The nonparallelism of the two ends of the rock samples was less than 0.05 mm, which met the requirements of the Standard for Test Methods of Engineering Rock Mass (GB/T 50266-2013). The acoustic emission system was the multichannel DISP acoustic emission system produced by Physical Acoustics Corporation (PAC), USA. The sensor type was NANO30, with a resonant frequency of 140 kHz and a dominant frequency band of 12-750 kHz. Six sensors were arranged on the upper, middle, and lower parts of the rock sample surface, as shown in Figure 1. The press was a YAW-1000 kN microcomputer-controlled electrohydraulic servo pressure testing machine with a maximum test force of 1000 kN. Uniaxial loading under displacement control was adopted, with a loading rate of 0.002 mm/s. During the experiment, the load, stress, and strain of the rock samples were recorded, and the acoustic emission ring count and energy were recorded by the acoustic emission system. The acoustic emission parameters mainly include time, channel, rise time, ring count, energy, duration, amplitude, average frequency, effective voltage (RMS), average signal level (ASL), peak frequency, inverse frequency, initial frequency, signal strength, absolute energy, center frequency, and waveform file.

2.2. Characteristics and Quiet Period of Acoustic Emission

As shown in Figure 2, the rock failure process can be divided into several stages according to the stress and strain. By comparing the stress curve with the AE ring count and the cumulative AE ring count, the whole rock failure process is divided into five stages on the basis of ring count.

Stage I is the compaction stage of pores and fractures. In the initial loading stage, the sandstone is in a low stress state. The original open structural planes or microfractures of the sample gradually close, and the rock is compacted. Early nonlinear deformation forms, and the acoustic emission activity is weak. The AE events are mainly generated by friction between the original fractures in the rock sample and by the compaction of pores and fracture interfaces. Stage II is the later stage of pore and fracture compaction and the linear elastic stage. The AE signals mainly derive from the rearrangement of grains in the rock sample after compression and from the formation of new fractures after the initial fracture compaction. Stage III is the stage of stable fracture development. As the load increases, the primary microcracks and new cracks in the rock sample expand steadily, and the cumulative AE ring count increases steadily, but the amplitude is not large. Stage IV is the stage of unstable fracture development. The axial and volumetric strain rates increase rapidly, and the expansion of microfractures changes qualitatively. The fracture expansion is fast and uncontrollable until the rock sample is completely destroyed. When the stress reaches about 85% of the peak stress, the AE events decrease significantly and the sample enters the quiet period of acoustic emission. Stage V is the postfracture stage. After the peak strength is reached, massive fractures in the rock sample rapidly expand, connect, and penetrate, and large crack failure zones occur, producing a large number of AE events.

It can be seen that it is difficult to recognize the quiet period of AE in real time during the experiment. Therefore, in this paper, the AE time domain data set, the frequency domain data set, and the composite data set combining the two were matched to the rock failure stages. The data sets were used as the machine learning samples, and the five stages of rock failure were used as the sample labels, that is, the target values in supervised learning. 80% of the samples were taken as the training set and the remaining 20% as the test set.
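The 80/20 partition described above can be sketched in a few lines. This is a minimal illustration using only the Python standard library; the function name `split_train_test`, the fixed seed, and the toy feature vectors are assumptions for the example and are not taken from the study.

```python
import random

def split_train_test(samples, labels, test_fraction=0.2, seed=0):
    """Shuffle the sample indices and split them into training and test subsets."""
    if len(samples) != len(labels):
        raise ValueError("samples and labels must have the same length")
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)  # fixed seed keeps the split reproducible
    n_test = round(len(indices) * test_fraction)
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    train = [(samples[i], labels[i]) for i in train_idx]
    test = [(samples[i], labels[i]) for i in test_idx]
    return train, test

# Hypothetical toy data: 10 feature vectors labelled with failure stages 1-5.
features = [[float(i), float(i) * 0.5] for i in range(10)]
stages = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5]
train_set, test_set = split_train_test(features, stages)
print(len(train_set), len(test_set))
```

Because the quiet period occupies only part of one stage, a stratified split that preserves the stage proportions in both subsets may be preferable in practice; the source does not state whether stratification was used.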

3. Acoustic Emission Data Processing

3.1. Time Domain Data Processing of Acoustic Emission Signal

First, Python was used to sort the AE time domain data by traversal. Outliers are inevitable during acoustic emission signal acquisition, and carrying them into the calculation and analysis can have unpredictable consequences for the final result. In this paper, the box plot method was used to remove the outliers, and the ring count of sample no. 2 is taken as an example. The distribution of the original ring count data is shown in Figure 3(a). A total of 172328 ring counts were produced; most of them are distributed between 0 and 10, and the outliers are greater than 11. After the outliers were removed, the histogram was redrawn as shown in Figure 3(b).
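A box plot flags a value as an outlier when it lies beyond the whiskers, which are conventionally placed 1.5 interquartile ranges outside the first and third quartiles. The sketch below illustrates that rule; the whisker factor of 1.5, the quartile method used by `statistics.quantiles`, and the sample ring counts are assumptions, since the source does not give these details.

```python
import statistics

def remove_boxplot_outliers(values, whisker=1.5):
    """Keep only the values that fall inside the box plot whiskers."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # first, second, third quartile
    iqr = q3 - q1
    lower = q1 - whisker * iqr
    upper = q3 + whisker * iqr
    return [v for v in values if lower <= v <= upper]

# Hypothetical ring counts: mostly small values plus two extreme hits.
ring_counts = [1, 2, 2, 3, 3, 4, 5, 6, 8, 10, 57, 120]
cleaned = remove_boxplot_outliers(ring_counts)
print(cleaned)
```

With these toy values the two extreme hits fall above the upper whisker and are dropped, while the small counts are retained.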

Normalizing the AE data is the basis of data processing, because acoustic emission has many time domain parameters and different parameters have different dimensions. If left unprocessed, these differences make learning more difficult and may affect the final results. To eliminate the differences in dimension and value range among the parameters, Python was used to normalize the AE data. There are two common methods of normalization: deviation standardization (min-max scaling) and standard deviation standardization (z-score). Deviation standardization does not change the relationship between the data and is the simplest way to eliminate the dimension and reduce the influence of the data range. The results of some AE data before and after deviation standardization are shown in Table 1(a) and Table 1(b).
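The two normalization methods named above can be written out as follows. This is a sketch with hypothetical amplitude values; deviation standardization maps each parameter to the range [0, 1], whereas standard deviation standardization centers it at zero with unit spread.

```python
import statistics

def min_max_normalize(values):
    """Deviation standardization: (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # a constant column carries no information
    return [(v - lo) / (hi - lo) for v in values]

def z_score_standardize(values):
    """Standard deviation standardization: (x - mean) / std."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]

# Hypothetical amplitudes in dB for one AE parameter column.
amplitudes_db = [40.0, 45.0, 50.0, 60.0]
scaled = min_max_normalize(amplitudes_db)
standardized = z_score_standardize(amplitudes_db)
print(scaled)
```

Each parameter column is scaled independently, so that ring count, energy, and amplitude end up on comparable ranges.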

3.2. Frequency Domain Data Processing of Acoustic Emission Signal

During the experiment, environmental factors and the vibration of the experimental equipment introduce noise, so the acoustic emission signals must be denoised. The wavelet transform has a good noise reduction effect on acoustic emission signals. Considering that rock acoustic emission signals occur suddenly and densely within a short time during rock failure, the db3 wavelet basis, which removes the noise to a large extent, was used to decompose the original acoustic emission signal into four levels. The wavelet decomposition coefficients were processed with a soft threshold, and the denoised signal was then reconstructed from the thresholded coefficients.
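The soft threshold step can be illustrated on its own. Soft thresholding sets coefficients whose magnitude is below the threshold to zero and shrinks the remaining ones toward zero by the threshold. The sketch below shows only this operator; the coefficient values and the threshold of 0.1 are hypothetical, and the source does not state how the threshold was chosen. The decomposition and reconstruction themselves would normally be delegated to a wavelet library.

```python
def soft_threshold(coefficients, threshold):
    """Shrink each coefficient toward zero by `threshold`; zero out the small ones."""
    shrunk = []
    for c in coefficients:
        magnitude = abs(c) - threshold
        if magnitude <= 0:
            shrunk.append(0.0)  # treated as noise
        else:
            shrunk.append(magnitude if c > 0 else -magnitude)
    return shrunk

# Hypothetical detail coefficients from one decomposition level.
detail = [0.05, -0.3, 1.2, -0.02, 0.8]
denoised_detail = soft_threshold(detail, threshold=0.1)
print(denoised_detail)
```

Unlike hard thresholding, which keeps the surviving coefficients unchanged, the soft variant avoids a jump at the threshold and tends to give a smoother reconstructed signal.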

Wavelet packet decomposition is a further refinement of the wavelet transform. The number of wavelet packet decomposition levels is determined according to the signal acquisition frequency. The acquisition frequency was set to 1 MHz, so the Nyquist frequency is 500 kHz. The signals were decomposed into three levels of wavelet packets, which divides the 0-500 kHz range into 2^3 = 8 frequency bands, and the corresponding bandwidth of each node after decomposition is 62.5 kHz.
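The band layout follows directly from the sampling rate and the number of levels: a three-level decomposition of a signal sampled at 1 MHz splits the 0-500 kHz range into 8 equal bands. A short worked computation, with a hypothetical helper name:

```python
def wavelet_packet_bands(sampling_rate_hz, levels):
    """Return (low, high) band edges in Hz for a full wavelet packet tree."""
    nyquist = sampling_rate_hz / 2
    n_bands = 2 ** levels
    width = nyquist / n_bands
    return [(i * width, (i + 1) * width) for i in range(n_bands)]

# 1 MHz acquisition frequency and three levels, as stated in the text.
bands_khz = [(lo / 1e3, hi / 1e3) for lo, hi in wavelet_packet_bands(1_000_000, 3)]
for lo, hi in bands_khz:
    print(f"{lo:6.1f} - {hi:6.1f} kHz")
```

The resulting edges (62.5, 125, 187.5, ... kHz) are the same band limits that appear in the energy distribution discussed later in this section.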

After the acoustic emission signals are decomposed by wavelet packet, the frequency bands obtained at the last level are ordered according to the Gray code, so it is necessary to reorder them in naturally increasing frequency order. The results are shown in Table 2.
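The reordering arises because each high-pass filtering and downsampling step mirrors the spectrum, so the node that holds the k-th lowest frequency band sits at the natural index given by the Gray code of k. The sketch below assumes the common labelling in which a node path is read from the first split to the last, with "a" for the approximation (low-pass) branch and "d" for the detail (high-pass) branch; the function names are hypothetical.

```python
def frequency_ordered_nodes(levels):
    """Natural node index that holds the k-th lowest frequency band (Gray code of k)."""
    return [k ^ (k >> 1) for k in range(2 ** levels)]

def node_path(index, levels):
    """Binary path of a node: 'a' = approximation branch, 'd' = detail branch."""
    bits = format(index, f"0{levels}b")
    return bits.replace("0", "a").replace("1", "d")

order = frequency_ordered_nodes(3)
paths = [node_path(i, 3) for i in order]
print(order)
print(paths)
```

Reading the band energies in this order gives features that increase monotonically in frequency, which is what the eight bands of 62.5 kHz each assume.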

To use the wavelet packet energy as experimental data for machine learning and as characteristic information representing the acoustic emission frequency domain, each acoustic emission waveform was converted into the energy ratios of its frequency bands. Taking the ratio of each of the 8 frequency bands to the total band energy keeps each waveform signal as one complete eigenvector instead of splitting it. The ratio of the wavelet packet energy of each frequency band of the denoised signal to that of the signal after wavelet packet transform is shown in Table 3, and each row in the table corresponds to the AE frequency domain signal of one time node. The ratio of each subband to the total energy is calculated in turn, as shown in Table 4.
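The energy ratio feature can be sketched as follows, taking the energy of a band to be the sum of its squared coefficients, which is the usual definition and is assumed here. The coefficient values are hypothetical and serve only to show the shape of the resulting 8-element feature vector.

```python
def band_energy_ratios(band_coefficients):
    """Energy of each band divided by the total energy over all bands."""
    energies = [sum(c * c for c in band) for band in band_coefficients]
    total = sum(energies)
    if total == 0:
        raise ValueError("signal has zero energy")
    return [e / total for e in energies]

# Hypothetical level-3 coefficients for the 8 bands of one waveform,
# already arranged in increasing frequency order.
coeffs = [
    [0.1, -0.1], [0.5, 0.4], [1.0, -0.9], [1.1, 1.0],
    [0.6, -0.7], [1.2, 1.1], [0.1, 0.0], [0.05, 0.05],
]
energy_features = band_energy_ratios(coeffs)
print([round(r, 3) for r in energy_features])
```

Because the ratios sum to 1, the feature vector describes how the energy is distributed over frequency independently of the overall signal strength.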

According to Tables 3 and 4, the acoustic emission energy produced by uniaxial compression of sandstone is mainly concentrated between 62.5 kHz and 375 kHz, which accounts for 96% of all energy. The band below 62.5 kHz accounts for 1%, and the band above 375 kHz accounts for 0.71%. Within the main range, 62.5 kHz~125 kHz accounts for 8.9%, 125 kHz~187.5 kHz for 21.5%, 187.5 kHz~250 kHz for 25.5%, 250 kHz~312.5 kHz for 10.4%, and 312.5 kHz~375 kHz for 29.7%. The above analysis shows that although the acoustic emission signals of sandstone are widely distributed, most of them are concentrated in the low- and medium-frequency bands.

4. Accuracy Analysis and Recognition Ability Evaluation of Machine Learning Models

Based on the scikit-learn machine learning package and the characteristics of the AE data, four supervised learning algorithms were used to learn the AE data sets: k-nearest neighbor (KNN), random forest (RF), gradient boosting decision tree (GBDT), and kernel support vector machine (KSVM). Although there are many machine learning algorithms suitable for binary classification, in view of the large amount and variety of acoustic emission data, the four algorithms used in this paper are among the most commonly used classification algorithms, and their parameter adjustment in the modelling process is relatively easy, which makes them well suited to acoustic emission data.

To evaluate the generalization performance of the four models, evaluation indexes are needed to measure their generalization ability. Accuracy, precision, recall, and F1 score (the harmonic mean of precision and recall) are commonly used performance evaluation indexes in classification problems. Among them, accuracy reflects the overall classification performance of the model, and the other indexes reflect the ability of the classifier to classify the different types of samples. Accuracy is the most common evaluation index. Generally speaking, the higher the accuracy, the better the classifier. However, accuracy can only reflect the overall prediction quality of the model; therefore, after evaluating the accuracy, it is necessary to use precision, recall, and F1 score to evaluate the ability of the model to recognize the AE quiet period of rock failure.
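The difference between the overall accuracy and the per-class indexes can be made concrete with a small one-vs-rest computation. In the sketch below the stage labels and predictions are hypothetical, and stage 4 (the stage in which the quiet period occurs) is treated as the positive class.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def precision_recall_f1(y_true, y_pred, positive):
    """One-vs-rest precision, recall, and F1 score for the `positive` label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Hypothetical true and predicted failure stages for 10 samples.
y_true = [1, 2, 3, 4, 4, 4, 4, 5, 3, 2]
y_pred = [1, 2, 3, 4, 4, 3, 3, 5, 4, 2]
overall = accuracy(y_true, y_pred)
precision, recall, f1 = precision_recall_f1(y_true, y_pred, positive=4)
print(overall, round(precision, 2), recall, round(f1, 2))
```

In this toy case 70% of all samples are classified correctly, yet only half of the stage 4 samples are found, which is why accuracy alone is not sufficient for judging quiet period recognition.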

4.1. Analysis of the Accuracy of Machine Learning Models

First, the accuracy of the k-nearest neighbor machine learning models using the acoustic emission time domain, frequency domain, and time-frequency domain composite data sets was evaluated. The n_neighbors interval is 1-20. The results are shown in Figure 4(a). As can be seen from Figure 4(a), for the k-nearest neighbor model, the accuracy of the model using the frequency domain data set is the lowest, the accuracy of the model using the composite data set is the highest, and the accuracy is the highest around the parameter . The maximum accuracy of the model using the composite data set is 0.89, indicating that nearly 90% of the data can be correctly recognized by the k-nearest neighbor model.

For the random forest model, the n_estimators interval is 10-200. The results are shown in Figure 4(b). As can be seen from Figure 4(b), the accuracy of the model using the frequency domain data set is the lowest, the accuracy of the model using the composite data set is the highest, and the accuracy is the highest around the parameter . The maximum accuracy of the model using the composite data set is 0.91, indicating that more than 90% of the data can be correctly recognized by the random forest model.

For the gradient boosting decision tree model, the n_estimators interval is 10-200. The results are shown in Figure 4(c). As can be seen from Figure 4(c), the accuracy of the model using the frequency domain data set is the lowest, the accuracy of the model using the composite data set is the highest, and the accuracy is the highest around the parameter . The maximum accuracy of the model using the composite data set is 0.90, indicating that 90% of the data can be correctly recognized by the gradient boosting decision tree model.

For the kernel support vector machine model, the gamma interval is 0.001-10, and the regularization parameter C is 1. The results are shown in Figure 4(d). As can be seen from Figure 4(d), the accuracy of the model using the frequency domain data set is the lowest, the accuracy of the model using the composite data set is the highest, and the accuracy is the highest around the parameter . The maximum accuracy of the model using the composite data set is 0.91, indicating that more than 90% of the data can be correctly recognized by the kernel support vector machine model.

The accuracy of the 12 models obtained from the 4 machine learning algorithms and 3 data sets is shown in Table 5. From the point of view of the data set, the accuracy of the models using the composite data set is the highest, between 0.89 and 0.91, with a mean of 0.90. The accuracy of the models using the frequency domain data set is the lowest, between 0.7 and 0.82, with a mean of 0.77. The accuracy of the models using the time domain data set ranges from 0.81 to 0.88, with a mean of 0.84. From the perspective of the machine learning algorithm, the accuracy of the kernel support vector machine is the highest, with a mean of 0.87, followed by the gradient boosting decision tree, whose average accuracy is 0.84. The average accuracy of k-nearest neighbor is 0.83, and that of random forest is 0.81. Thus, whether viewed by algorithm or by data set, the four machine learning models classify the stages of the failure process with high overall accuracy.

4.2. Evaluation of the Ability to Recognize the Quiet Period of Acoustic Emission in Rock Failure

Accuracy indicates to some extent whether a model is effective, but for data sets with unbalanced categories, a class that makes up the majority of the samples dominates the accuracy and distorts the overall judgment of the model. Accuracy can only reflect the overall prediction quality of the model; therefore, after evaluating the accuracy, it is necessary to use precision, recall, and F1 score to evaluate the ability of the model to recognize the AE quiet period of rock failure. First, the grid search method was used to select the optimal parameters of the machine learning models, and then the acoustic emission quiet period recognition ability of the four machine learning models was evaluated.

The KNN model requires an appropriate number of neighbors k, and different values of k give different decision boundaries. The data set was divided into a training set and a test set, and models with different values of k were established. The best model was selected to evaluate the precision, recall, and F1 score for the AE time domain, frequency domain, and composite data sets. The results are shown in Figure 5(a).
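The parameter sweep over k can be illustrated with a small self-contained classifier. The sketch below is not the scikit-learn implementation used in the study; it is a plain majority-vote KNN written with the standard library, applied to hypothetical two-class toy data, and it scores each candidate k by test accuracy only.

```python
from collections import Counter
import math

def knn_predict(train, query, k):
    """Majority vote among the k training points closest to `query`."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

def sweep_k(train, test, k_values):
    """Return {k: accuracy on the test set} for each candidate k."""
    scores = {}
    for k in k_values:
        hits = sum(knn_predict(train, x, k) == y for x, y in test)
        scores[k] = hits / len(test)
    return scores

# Hypothetical toy data: class 0 near the origin, class 1 further out.
train = [
    ((0.0, 0.0), 0), ((0.0, 1.0), 0), ((1.0, 0.0), 0),
    ((3.0, 3.0), 1), ((4.0, 4.0), 1), ((4.0, 5.0), 1),
    ((5.0, 4.0), 1), ((5.0, 5.0), 1),
]
test = [((0.5, 0.5), 0), ((1.2, 1.2), 0), ((4.5, 4.5), 1)]

scores = sweep_k(train, test, k_values=range(1, 8, 2))
best_k = max(scores, key=scores.get)  # first k that reaches the top score
print(scores, best_k)
```

In the toy data the score collapses once k exceeds the size of the smaller class, which shows why the choice of k changes the decision boundary. For real data, a utility such as scikit-learn's `GridSearchCV` performs the same sweep with cross-validation; the source does not specify which utility was used.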

As the number of neighbors increases, the precision of the models using the time domain, frequency domain, and composite data sets changes in a similar way: it decreases when the number of neighbors is 2 and then slowly increases to its highest value when the number of neighbors is 14. The highest precision using the composite data is 0.7, and the recall and F1 score of the composite data model are higher than those of the time domain model and the frequency domain model. The precision of the time domain model can reach 0.5. The three indexes of the frequency domain model are the lowest, and its precision is only about 0.5. If the k-nearest neighbor model uses the composite data set, about 70% of the samples it labels as the acoustic emission quiet period are correct. Given the characteristics of acoustic emission, a higher recall is preferred provided that the classification precision is ensured. After comprehensive consideration, the KNN model using the composite data with 14 neighbors was selected.

Random forest models were established for the acoustic emission time domain, frequency domain, and composite data sets, with n_estimators ranging from 10 to 200, and the performance indexes are shown in Figure 5(b). The precision of the composite data model is the highest, about 0.7, and its recall can reach 0.5. The precision of the time domain model is about 0.55, and its recall is 0.35. The precision of the frequency domain model is about 0.45, and its recall is only about 0.15. The composite data model thus has the highest precision, followed by the time domain model and finally the frequency domain model. The most suitable random forest model was selected by comprehensively considering precision and recall: the random forest model using the composite data with n_estimators set to 140.

To select the most suitable parameters of the gradient boosting decision tree, the composite data set was used to tune the parameters n_estimators, learning_rate, and max_depth in turn, and the appropriate values were then selected.

Figure 5(c) shows that the precision of the composite data model is stable at about 0.7. When n_estimators is 50, the precision reaches 0.7 and the recall reaches 0.52. Under the same conditions, the time domain and frequency domain models do not perform as well as the composite data model. The precision and recall of the composite data model are better than those of the time domain model and the frequency domain model, indicating that combining time domain data and frequency domain data plays an important role in the construction of the model. The composite data model performs best when n_estimators is 50; therefore, the GBDT model using the composite data was established with n_estimators of 50, learning_rate of 0.1, and max_depth of 8.

As shown in Figure 5(d), the kernel function of the kernel support vector machine is the radial basis function (RBF). After setting C to 1, a suitable gamma was sought between 0.001 and 10, and the best model was selected by comparing the data sets. On the whole, as gamma increases, the precision of the three models using different data sets first rises and then decreases. The precision of the time domain model is similar to that of the composite data model and better than that of the frequency domain model, and the highest precision of the time domain model is 0.76. However, its recall and F1 score are lower than those of the composite data model, so it is not a good choice. Therefore, gamma was set to 0.1, C was set to 1, and the composite data set was used as the sample to establish the KSVM model.
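The role of gamma can be seen from the RBF kernel itself, K(x, y) = exp(-gamma * ||x - y||^2): a larger gamma makes the similarity between two samples fall off faster with distance, so each training sample influences a smaller neighborhood. The sketch below evaluates the kernel for two hypothetical normalized feature vectors over the gamma range mentioned in the text.

```python
import math

def rbf_kernel(x, y, gamma):
    """Radial basis function kernel: exp(-gamma * squared Euclidean distance)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)

# Two hypothetical normalized feature vectors (squared distance 0.25).
x, y = (0.2, 0.4), (0.5, 0.8)
similarities = {g: rbf_kernel(x, y, g) for g in (0.001, 0.01, 0.1, 1, 10)}
for g, value in similarities.items():
    print(f"gamma={g:<6} K={value:.4f}")
```

A very small gamma makes all samples look alike and underfits, while a very large gamma makes every sample look unique and overfits, which is consistent with the rise-then-fall trend of precision reported for Figure 5(d).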

Using the four machine learning algorithms, machine learning models were built for the three kinds of data sets, and the classification performance of the models was evaluated based on accuracy, precision, recall, and F1 score. The accuracy reflects how accurately the five stages of the rock failure process are classified overall. For the study of the AE quiet period, comprehensive comparison showed that the models based on the composite data set had the best recognition effect, followed by the models based on the time domain data set and finally the models based on the frequency domain data set. The generalization performance indexes of the models based on the four machine learning algorithms using the composite data set are compared in Table 6.

It can be seen from Table 6 that the precision of the four machine learning models is between 0.68 and 0.71, with little difference. Among them, the precision of the KSVM model is the lowest, only 0.68. The precision of the KNN model is the same as that of the GBDT model, both being 0.7. The precision of the RF model is the highest, at 0.71. The recall of the four models is between 0.37 and 0.52. The recall of the KNN model is the lowest, at 0.37, the recall of the GBDT model is the highest, at 0.52, and the recall of the RF model is 0.5. The F1 score of the four models is between 0.48 and 0.6, and that of the KNN model is the lowest, at 0.48. The F1 score of the KSVM model is slightly higher than that of the KNN model, and there is little difference between the RF model and the GBDT model.
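Since the F1 score is the harmonic mean of precision and recall, the reported values can be cross-checked against one another. The sketch below uses only the precision and recall figures quoted in the text for the KNN, RF, and GBDT models; the recall of the KSVM model is not stated in the text, so that model is omitted.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# (precision, recall) pairs quoted in the text for the composite data set.
reported = {"KNN": (0.70, 0.37), "RF": (0.71, 0.50), "GBDT": (0.70, 0.52)}
f1_by_model = {name: round(f1_score(p, r), 2) for name, (p, r) in reported.items()}
print(f1_by_model)
```

The computed values reproduce the stated lower and upper ends of the F1 range (0.48 for KNN and 0.6 for GBDT) and show the RF and GBDT models within about 0.01 of each other, in line with the statement that there is little difference between them.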

The aim of this paper is that whenever the quiet period of acoustic emission occurs, the machine learning models should recognize it. Therefore, the final evaluation criterion was to compare the other indexes of the models on the condition that precision is ensured. In terms of precision, recall, and F1 score, the best model is the RF model, followed by the GBDT model and the KNN model, and finally the KSVM model.

5. Conclusions

Acoustic emission is an important means of monitoring and predicting rock dynamic disasters such as rockburst and underground impact pressure, and the quiet period of acoustic emission is an important precursor for prediction. To recognize the quiet period of acoustic emission accurately, this paper took sandstone as the research object and studied the recognition of the sandstone acoustic emission quiet period based on four machine learning algorithms. The conclusions are as follows:
(1) The acoustic emission time domain data set and frequency domain data set can be used alone or combined into a composite data set to establish the acoustic emission quiet period recognition model.
(2) The accuracy of the acoustic emission quiet period recognition models established in this paper is high. The accuracy of the models using the frequency domain data set is the lowest, and the accuracy of the models using the composite data set is the highest, at about 90%. From the point of view of the machine learning algorithm, the KSVM model has the highest accuracy, with an average of 0.87.
(3) The precision of the four machine learning models is between 0.68 and 0.71. Among them, the precision of the KSVM model is the lowest, only 0.68. The precision of the KNN model and the GBDT model is the same, both being 0.7, and the RF model has the highest precision of 0.71. Considering precision, recall, and F1 score, the best model for acoustic emission quiet period recognition is the RF model, followed by the GBDT model, the KNN model, and finally the KSVM model.

Data Availability

The raw/processed data required to reproduce these findings cannot be shared at this time as the data also forms part of an ongoing study.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This study was supported by the National Natural Science Foundation of China (No. 51604184) [¥200000].