Abstract

Measures of predictability in physiological signals based on entropy metrics have been widely used in medical assessment and clinical diagnosis. In this paper, we propose a new entropy-based pattern learning approach that combines singular spectrum analysis (SSA) with entropy measures for the assessment of physiological signals. Physiological signals are first represented as a series of SSA components, and well-established entropy measures are then extracted from the resulting SSA components, which facilitates feature extraction from physiological signals. The entropy measures of notable SSA components are used to form input features and fed into a pattern classifier. To demonstrate its validity, applicability, and versatility, the proposed entropy-based pattern learning is used to perform medical assessments with three kinds of classical physiological signals, that is, electroencephalogram (EEG), electromyogram (EMG), and RR-interval signals. Experiments demonstrate that in all cases, the proposed entropy-based pattern learning can effectively capture specific biosignal patterns of physiological signals and achieve excellent identification performance for the assessment of EEG, EMG, and RR-interval signals. In addition, a comparison of the identification performance of entropy-based pattern learning based on the physiological signals themselves and on the SSA components shows that the discriminating power based on the SSA components is much stronger than that based on the physiological signals themselves. Since it can be easily extended to the analysis of other physiological signals, the proposed entropy-based pattern learning may be used as an efficient approach to reveal biosignal patterns for medical assessment of physiological signals.

1. Introduction

Physiological signals are an invaluable data source that can be used to examine the functioning of the human body [1, 2]. With the help of data mining and machine learning techniques, physiological signals are already widely used to assess cognitive states [3], monitor psychological functions [4], diagnose human diseases [5–7], and so on. Driven by strong demand from practical applications, increasing attention has been paid in recent years to using machine learning methods for the analysis and assessment of physiological signals [8]. For instance, Narula et al. demonstrated an application that used three machine learning algorithms (support vector machines (SVM), random forests, and artificial neural networks) and echocardiographic data to automatically discriminate hypertrophic cardiomyopathy from physiological hypertrophy in athletes [8]. Owing to the remarkable progress of machine learning, unprecedented accuracy in pattern learning has been achieved. Some intelligent algorithms can even outperform a qualified doctor in the clinical diagnosis of diseases, such as the detection of skin cancer [9]. Pattern learning for the analysis and forecasting of physiological signals can therefore be viewed as a promising avenue for healthcare applications based on physiological signals [10, 11].

Entropy-based pattern learning has proved to be a powerful means of revealing the intrinsic features in physiological signals. Entropy can be used not only to measure the additional information needed to determine the state of a system, but also to quantify the irregular, random, and chaotic behavior of physiological signals [12]. Many applications of entropy-based pattern learning have been reported that use a variety of entropy measures, such as approximate entropy (ApEn), sample entropy (SampEn), permutation entropy, spectral entropy, short-term Rényi entropy, and Shannon entropy [13–21]. For instance, Raghu et al. proposed a novel minimum variance modified fuzzy entropy to identify epileptic seizures in real time from electroencephalogram (EEG) signals, which achieved a classification accuracy of 100% [22]. As a promising approach for medical assessment of physiological signals, entropy-based pattern learning can often achieve very good performance in the detection of physiological status and activity, or even in medical diagnosis. A review of the available literature suggests three main factors that can affect the performance of entropy-based pattern learning for the assessment of physiological signals.

(i) First, machine learning algorithms are well known to have a great influence on the performance of entropy-based pattern learning tasks. For instance, Acharya et al. explored seven different machine learning algorithms to investigate the automated diagnosis of epileptic EEG using entropies [23]. Recently, some scholars have even tried to employ deep learning to analyze physiological signals [24].

(ii) Second, the type of entropy measure also plays an important role in the performance on specific pattern learning tasks. As is well known, different types of entropy measures can capture different features from physiological signals [25]. For instance, Li et al. investigated nine entropy measures of EEG signals for emotion recognition [26].

(iii) Third, how physiological signals are processed so that entropy measures can be extracted from them effectively has become one of the key factors that determine the performance on entropy-based pattern learning tasks. Existing studies have shown that, when entropy-based pattern learning is used for the assessment of physiological signals, the feature extraction of entropy measures depends heavily on the decomposition and representation methods of physiological signals [27–39].

Many advanced methods, such as the discrete wavelet transform [35, 39], wavelet packet decomposition [36, 37], empirical mode decomposition (EMD) [38], and ensemble empirical mode decomposition [28], have been introduced to process physiological signals so that entropy measures can be extracted from them effectively. Table 1 summarizes the existing decomposition and representation methods used in entropy-based pattern learning for this purpose. These practical applications show that, when entropy-based pattern learning is used for the assessment of physiological signals, the decomposition and representation of the signals is a critical step in the feature extraction of entropy measures, because it allows specific intrinsic features to be extracted from physiological signals. For instance, Sharma et al. proposed a framework for entropy-based pattern learning based on EMD to identify focal and nonfocal EEG signals [38]. The EEG signals were first represented as intrinsic mode functions (IMFs) by the EMD method. Then five entropy measures, i.e., Shannon entropy, Rényi entropy, ApEn, SampEn, and phase entropy, were calculated from different IMFs. The SVM was used as a pattern classifier to identify the focal and nonfocal EEG signals. Similarly, Gupta et al. presented a framework for entropy-based pattern learning based on the flexible analytic wavelet transform (FAWT) to detect focal EEG signals [29]. The EEG signals were first decomposed with 15 levels of FAWT. Three entropy metrics, i.e., cross correntropy, Stein's unbiased risk estimate entropy, and log energy entropy, were calculated from the sub-band signals and the reconstructed original signal. The $k$-nearest neighbor and SVM were used as pattern classifiers to perform the automatic diagnosis of focal EEG signals. These recent advances in entropy-based pattern learning clearly show that how physiological signals are processed to obtain the optimum performance on a specific pattern learning task is one of the critical practical issues in current research.

As we can see in Table 1, exploring effective decomposition and representation methods of physiological signals to facilitate the feature extraction of entropy measures is a rich research area. In our study, we develop a novel feature engineering approach based on singular spectrum analysis (SSA) to decompose physiological signals for entropy-based pattern learning, which differs from the available approaches to processing physiological signals for entropy-based pattern learning. The novelty of the proposed entropy-based pattern learning is that the combination of SSA and entropy measures facilitates the feature extraction of entropy measures from physiological signals: physiological signals are first represented as a series of SSA components, and entropy measures are then extracted from the resulting SSA components. To summarize, the main contributions of our work are as follows: (1) An innovative entropy-based pattern learning approach is proposed for the assessment of physiological signals, which is based on a combination strategy of SSA and entropy measures. (2) The effect of entropy measures of SSA components on the assessment of physiological signals is analyzed with three kinds of classical physiological signals (EEG, electromyogram (EMG), and RR-interval signals) associated with specific physiological states and biosignal patterns. (3) The performance of the proposed entropy-based pattern learning for the assessment of EEG, EMG, and RR-interval signals is investigated extensively, and its validity, applicability, and versatility are demonstrated.

This paper is organized as follows. Section 2 introduces the experimental data used in our work. Section 3 describes the proposed entropy-based pattern learning in detail, including the representation of physiological signals as SSA components, the entropy measures, the pattern classifiers, and the performance evaluation. Section 4 presents the experimental results of using the proposed entropy-based pattern learning for the assessment of EEG, EMG, and RR-interval signals and compares them with existing studies. Section 5 discusses some key problems for the proposed entropy-based pattern learning and highlights its advantages and disadvantages. Section 6 concludes our work.

2. Materials

In this paper, the proposed entropy-based pattern learning was applied to three kinds of typical physiological signals, namely EEG, EMG, and RR-interval signals, and evaluated experimentally on the identification of specific biosignal patterns in these signals so as to demonstrate its validity, applicability, and versatility.

2.1. EEG Signals

The EEG signals used in this paper are available online and were provided by the Department of Epileptology, University of Bonn, Germany [40, 41]. The open EEG dataset consists of five subsets (datasets S, F, N, O, and Z). Datasets O and Z were collected from surface EEG recordings using a standardized electrode placement scheme according to the international 10–20 system. Datasets N, F, and S originated from an EEG archive of presurgical diagnosis and were selected from all recording sites exhibiting ictal activity. Each EEG dataset includes 100 segments, and the duration of each segment is 23.6 seconds, corresponding to 4097 sample points at a sampling frequency of 173.61 Hz. It should be noted that all segments of datasets S, F, N, O, and Z are single-channel EEG signals, which were selected and cut out from continuous multichannel EEG recordings after visual inspection for artifacts, e.g., due to muscle activity or eye movements. For more in-depth information, please refer to reference [41].

In this paper, we used the EEG signals of datasets O and Z to demonstrate the proposed entropy-based pattern learning for the identification of the eye-open and eye-closed states. The EEG signals of datasets O and Z were recorded from five healthy volunteers who were relaxed in an awake state with their eyes closed and open in the course of the experiment, respectively. There were a total of 200 samples available and the numbers of samples of eye-open and eye-closed states were both 100.

2.2. EMG Signals

The EMG signals used in this paper were obtained from the Physiological Action Data Set (PADS) in the UCI Machine Learning Repository [42]. The EMG signals were recorded from 4 subjects (age 25–30; one female) who were instructed to perform ten normal and ten aggressive physical actions (the ten normal actions were: (1) bowing; (2) clapping; (3) handshaking; (4) hugging; (5) jumping; (6) running; (7) seating; (8) standing; (9) walking; and (10) waving, respectively, and the ten aggressive actions include: (1) elbowing; (2) front kicking; (3) hammering; (4) heading; (5) kneeing; (6) pulling; (7) punching; (8) pushing; (9) side kicking; and (10) slapping, respectively). The EMG signals were 8-channel recordings recorded by the Delsys EMG apparatus using eight skin-surface electrodes placed on the upper arms (biceps and triceps), and upper legs (thighs and hamstrings). Each contained about 10000 samples. For more details about the PADS database, please refer to the online information at https://archive.ics.uci.edu/ml/datasets [42].

In this paper, we used the EMG signals with the first 5000 data points of the first channel (corresponding to the right bicep EMG signals) to demonstrate the proposed entropy-based pattern learning for the identification of the normal and aggressive physical actions. There were a total of 80 samples available and the numbers of samples of normal and aggressive physical actions were both 40.

2.3. Interbeat Intervals

The interbeat intervals (RR-interval signals, where RR refers to the interval between two successive R peaks of electrocardiogram (ECG) signals) used in this paper were obtained from two RR interval databases found at https://physionet.org/physiobank: one was the normal sinus rhythm (NSR) RR interval database [43] and the other was the congestive heart failure (CHF) RR interval database [44]. The NSR RR interval database was recorded from 54 subjects (30 men, aged 28.5–76, and 24 women, aged 58–73) with the NSR-heart state. The CHF RR interval database was recorded from 29 subjects aged 34–79, with congestive heart failure (NYHA classes I, II, and III). Subjects included 8 men and 2 women; gender was not known for the remaining 21 subjects. The original ECG recordings for both the NSR and CHF RR interval databases were digitized at 128 Hz, and the RR-interval signals were obtained by automated analysis with manual review and correction. For more in-depth information, please refer to the online information at https://physionet.org/physiobank.

In this paper, we used the first one hour of the RR-interval signals to demonstrate the entropy-based pattern learning for the identification of the NSR-heart and CHF-heart states. There were a total of 83 samples available and the numbers of samples of the NSR-heart and CHF-heart states were 54 and 29, respectively.

3. Methods

This paper proposes a new entropy-based pattern learning approach for the assessment of physiological signals based on a combination strategy of SSA and entropy measures. The main innovation of the proposed approach is to employ SSA components to facilitate the feature extraction of entropy measures from physiological signals. Figure 1 shows the block diagram of the proposed entropy-based pattern learning. As shown in Figure 1, physiological signals are first represented as SSA components by the SSA method. After that, well-established entropy measures are calculated from the resulting SSA components. The entropy measures of notable SSA components are selected to form input features and fed into a pattern classifier. A more detailed description of the proposed entropy-based pattern learning is given below.

3.1. Singular Spectrum Analysis (SSA)

SSA has become a powerful technique for the analysis and forecasting of time series and is essentially a model-free method based on principal component analysis [45, 46]. SSA incorporates elements of classical time series analysis, multivariate statistics, multivariate geometry, dynamical systems, and signal processing, and can overcome many limitations such as nonlinearity and nonstationarity of signals [47, 48]. As a model-free method for time series analysis, SSA can be applied to arbitrary time series, including nonstationary time series [49]. SSA usually involves two complementary stages: decomposition and reconstruction [46]. The stage of decomposition consists of two steps: Embedding and Singular Value Decomposition (SVD). Consider a time series $x_n$ for $n = 1, 2, \ldots, N$. The first step, Embedding, maps the one-dimensional time series into a multidimensional vector space based on phase space reconstruction with window length $L$, and the trajectory matrix $\mathbf{X}$ of the time series can be obtained as follows:

$$
\mathbf{X} = [X_1, X_2, \ldots, X_K] =
\begin{pmatrix}
x_1 & x_2 & \cdots & x_K \\
x_2 & x_3 & \cdots & x_{K+1} \\
\vdots & \vdots & \ddots & \vdots \\
x_L & x_{L+1} & \cdots & x_N
\end{pmatrix}, \tag{1}
$$

where

$$
X_j = (x_j, x_{j+1}, \ldots, x_{j+L-1})^{T}, \quad j = 1, 2, \ldots, K, \quad K = N - L + 1.
$$

The second step is to perform the SVD of the trajectory matrix $\mathbf{X}$ so that it can be further decomposed into the sum of rank-one elementary matrices $\mathbf{X}_i$:

$$
\mathbf{X}_i = \sqrt{\lambda_i}\, U_i V_i^{T}, \tag{2}
$$

$$
\mathbf{X} = \mathbf{X}_1 + \mathbf{X}_2 + \cdots + \mathbf{X}_L, \tag{3}
$$

where $\lambda_i$ and $U_i$ are the eigenvalues and eigenvectors of the covariance matrix $\mathbf{S} = \mathbf{X}\mathbf{X}^{T}$, respectively, and $V_i$ is defined to be $\mathbf{X}^{T} U_i / \sqrt{\lambda_i}$. The vectors $U_i$ and $V_i$ for $i = 1, 2, \ldots, L$ are also called the left and right singular vectors, respectively. The collection ($\sqrt{\lambda_i}$, $U_i$, $V_i$) is referred to as the $i$-th eigentriple of the SVD of the trajectory matrix $\mathbf{X}$. It must be noted that the eigenvalues of $\mathbf{S}$ are given in descending order of magnitude (i.e., $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_L \ge 0$).

The stage of reconstruction also consists of two steps: Grouping and Diagonal Averaging. The indices $1, 2, \ldots, L$ correspond to the ordinal numbers of the eigentriples of the SVD of the trajectory matrix $\mathbf{X}$. In the step of Grouping, the set of indices $\{1, 2, \ldots, L\}$ is first divided into $v$ disjoint subsets ($I_1, I_2, \ldots, I_v$), and then the matrices ($\mathbf{X}_{I_1}, \mathbf{X}_{I_2}, \ldots, \mathbf{X}_{I_v}$) can be obtained from the eigentriples of the corresponding ordinal numbers in the forms of (2) and (3). Finally, for a subset $I = \{i_1, i_2, \ldots, i_p\}$, the resulting matrix can be expressed as

$$
\mathbf{X}_I = \mathbf{X}_{i_1} + \mathbf{X}_{i_2} + \cdots + \mathbf{X}_{i_p}. \tag{4}
$$

In the step of Diagonal Averaging, because the size of the resulting matrix $\mathbf{X}_I$ is $L \times K$, the resulting matrix can first be described in the form $\mathbf{Y}$ with elements $y_{ij}$, i.e., $1 \le i \le L$ and $1 \le j \le K$.

By averaging the matrix elements over the diagonals $i + j = k + 1$, a new one-dimensional time series $y_k$ for $k = 1, 2, \ldots, N$ can be reconstructed from the resulting matrix $\mathbf{X}_I$. The expression of the time series $y_k$ can be derived as follows [46]:

$$
y_k =
\begin{cases}
\dfrac{1}{k} \displaystyle\sum_{m=1}^{k} y^{*}_{m,\,k-m+1}, & 1 \le k < L^{*}, \\[2ex]
\dfrac{1}{L^{*}} \displaystyle\sum_{m=1}^{L^{*}} y^{*}_{m,\,k-m+1}, & L^{*} \le k \le K^{*}, \\[2ex]
\dfrac{1}{N-k+1} \displaystyle\sum_{m=k-K^{*}+1}^{N-K^{*}+1} y^{*}_{m,\,k-m+1}, & K^{*} < k \le N,
\end{cases} \tag{5}
$$

where $L^{*} = \min(L, K)$, $K^{*} = \max(L, K)$, and $y^{*}_{ij} = y_{ij}$ if $L < K$ and $y^{*}_{ij} = y_{ji}$ otherwise.

The reconstructed time series $y_k$ corresponds one-to-one to the resulting matrix $\mathbf{X}_I$. When $I = \{1, 2, \ldots, L\}$ in the step of Grouping, the resulting matrix $\mathbf{X}_I$ in Equation (4) is equal to the trajectory matrix $\mathbf{X}$ in Equation (3). Under this condition, the original time series can be reconstructed from the resulting matrix $\mathbf{X}_I$, that is, $y_k = x_k$. Furthermore, when the index set $I$ is chosen to be $\{1\}, \{2\}, \ldots, \{L\}$ in turn, so that the resulting matrix $\mathbf{X}_I$ is the elementary matrix $\mathbf{X}_i$, the corresponding component $c^{(i)}$ can be reconstructed. Hence, under these conditions the original time series can be expressed as the sum of all components, $x_n = \sum_{i=1}^{L} c^{(i)}_n$.

SSA is an important signal decomposition method based on principal component analysis, which can decompose the original time series into the sum of a small number of interpretable components [46]. From the perspective of SSA component analysis, the components $c^{(i)}$ can be viewed as the SSA components of the original time series. Because the decomposed components are data adaptive, the SSA components are very suitable for the dynamic analysis of multicomponent and nonlinear physiological signals [46].
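The four SSA steps described above (Embedding, SVD, Grouping with one eigentriple per group, and Diagonal Averaging) can be sketched in a few lines of NumPy. This is a minimal illustration rather than the implementation used in this work: the function names `diagonal_average` and `ssa_components`, the synthetic test signal, and the assumption that the window length does not exceed the number of lagged vectors are all choices made for the example.

```python
import numpy as np


def diagonal_average(matrix):
    """Average a matrix over its anti-diagonals (Diagonal Averaging step)."""
    rows, cols = matrix.shape
    flipped = matrix[::-1]  # anti-diagonals of `matrix` become diagonals of `flipped`
    return np.array([flipped.diagonal(k).mean() for k in range(-rows + 1, cols)])


def ssa_components(x, window_length):
    """Return one reconstructed series per eigentriple (rows sum to x).

    Assumes window_length <= len(x) - window_length + 1.
    """
    x = np.asarray(x, dtype=float)
    n_cols = x.size - window_length + 1
    # Embedding: columns are the lagged vectors of the trajectory matrix
    trajectory = np.column_stack([x[j:j + window_length] for j in range(n_cols)])
    # SVD: singular values are returned in descending order
    U, s, Vt = np.linalg.svd(trajectory, full_matrices=False)
    # Grouping with one eigentriple per group, then Diagonal Averaging
    return np.array([
        diagonal_average(s[i] * np.outer(U[:, i], Vt[i])) for i in range(s.size)
    ])


# Synthetic example signal (illustrative only, not data from this study)
rng = np.random.default_rng(0)
t = np.arange(200)
signal = np.sin(2 * np.pi * t / 25) + 0.1 * rng.standard_normal(t.size)
components = ssa_components(signal, window_length=10)
print(components.shape)                              # (10, 200)
print(np.allclose(components.sum(axis=0), signal))   # True
```

Because Diagonal Averaging is linear and the elementary matrices sum to the trajectory matrix, the rows of `components` add up to the input series, which is the property used when a signal is represented as the sum of its SSA components.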

3.2. Physiological Signals with the Representation of SSA Components

Following the analysis of time series above, physiological signals $x_n$ (where $n = 1, 2, \ldots, N$) can first be mapped to the trajectory matrix $\mathbf{X}$ with window length $L$. Then, each of the eigentriples of the SVD of the trajectory matrix $\mathbf{X}$ is used to reconstruct its corresponding SSA component, denoted as $c^{(i)}$ (where $i = 1, 2, \ldots, L$) [45]. For instance, the first SSA component is reconstructed from the first eigentriple of the SVD of the trajectory matrix, the second SSA component is reconstructed from the second eigentriple, and so on. Finally, the physiological signals $x_n$ (where $n = 1, 2, \ldots, N$) can be represented as the sum of all SSA components $c^{(i)}$ (where $i = 1, 2, \ldots, L$).

3.3. Entropy Measures

A variety of well-established entropy measures, such as ApEn, SampEn, fuzzy entropy (FuzzyEn), and multiscale entropy (MSEn), have been proposed to quantify the regularity or irregularity of physiological signals that are collected from subjects and represent the dynamic physiological processes of the human body [50–52]. ApEn is a complexity measure of physiological signals that measures predictability by assessing the irregularity of the signals. SampEn, a modified version of ApEn proposed by Richman et al., performs better than ApEn in terms of consistency and dependence on data length [51]. FuzzyEn, proposed by Chen et al., was developed on the basis of SampEn and uses a fuzzy membership function (usually from the family of exponential functions) to measure the similarity of two vectors so that the similarity does not change abruptly [52]. MSEn, proposed by Costa et al. [12], is based on the evaluation of SampEn over multiple time scales and thus serves as an entropy metric for physiological signals that takes multiple time scales into account. The original signal is divided into nonoverlapping segments by coarse-graining, and the averages of the data points in each segment constitute the coarse-grained signal.
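The coarse-graining step used by MSEn can be illustrated with a short sketch. The function name `coarse_grain` and the toy input are hypothetical and chosen only for this example; samples left over at the end of the signal that do not fill a complete segment are dropped, which is one common convention.

```python
def coarse_grain(signal, scale):
    """Average consecutive nonoverlapping segments of length `scale`."""
    n_segments = len(signal) // scale  # incomplete trailing segment is dropped
    return [
        sum(signal[i * scale:(i + 1) * scale]) / scale
        for i in range(n_segments)
    ]


# Toy example (illustrative values only)
series = [1, 2, 3, 4, 5, 6, 7]
print(coarse_grain(series, 2))  # [1.5, 3.5, 5.5]
print(coarse_grain(series, 3))  # [2.0, 5.0]
```

An entropy measure such as SampEn would then be evaluated on the coarse-grained series at each scale.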

We used ApEn, SampEn, FuzzyEn, and MSEn as entropy measures to extract features from the SSA components derived from physiological signals. For ApEn, SampEn, and FuzzyEn, the typical calculation parameters are $m = 2$ or 3 and $r = 0.1$ to $0.25 \times \mathrm{STD}$, where $m$ is the embedding dimension, $r$ is the tolerance, and STD is the standard deviation of the time series. In this paper, the assignments of the calculation parameters for ApEn, SampEn, FuzzyEn, and MSEn are shown in Table 2.
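As an illustration of how one of these measures is computed, the following is a minimal pure-Python sketch of SampEn. It is not the code used in this work: the function name `sample_entropy`, the default values (embedding dimension 2 and a tolerance of 0.2 times the population standard deviation), the use of the Chebyshev distance with a "less than or equal" match criterion, and the test signals are all assumptions made for the example, and Table 2 is not reproduced here.

```python
import math
import random
import statistics


def sample_entropy(x, m=2, r_factor=0.2):
    """SampEn = -ln(A / B), with A and B the template match counts.

    B counts pairs of length-m templates within tolerance r, A does the
    same for length m + 1; self-matches are excluded.
    """
    n = len(x)
    r = r_factor * statistics.pstdev(x)  # tolerance as a fraction of the STD

    def count_matches(length):
        # Use the same n - m starting points for both template lengths
        templates = [x[i:i + length] for i in range(n - m)]
        count = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):
                distance = max(abs(a - b) for a, b in zip(templates[i], templates[j]))
                if distance <= r:
                    count += 1
        return count

    b = count_matches(m)
    a = count_matches(m + 1)
    if a == 0 or b == 0:
        return float("inf")  # undefined when no matches are found
    return -math.log(a / b)


# Illustrative signals: a perfectly regular series and uniform noise
regular = [0, 1] * 50
random.seed(0)
noise = [random.random() for _ in range(200)]
print(sample_entropy(regular))
print(sample_entropy(noise))
```

A perfectly periodic series yields a SampEn of zero because every length-2 match extends to a length-3 match, whereas irregular series give larger values. In the proposed approach, such a function would be applied to each SSA component rather than to the raw signal.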

3.4. Screening Notable SSA Components

Screening notable SSA components from the resulting SSA components can be viewed as a step of feature dimension reduction. The paired-sample $t$-test and the effect size of Cohen's $d$ have been applied extensively in existing research [35, 53]. In our work, when evaluating the performance of the proposed entropy-based pattern learning, the paired-sample $t$-test and the effect size of Cohen's $d$ are first used to screen the notable SSA components in the training phase.

For the entropy measures of the SSA components, the $p$ values of the paired-sample $t$-test obtained from the entropy measures of two sample groups are used to rank the SSA components in the training phase. The lower the $p$ value, the more discriminative the entropy measure of the SSA component. In this paper, the statistical significance is set a priori at $p < 0.05$, and SSA components whose $p$ values are less than 0.05 are recognized as the notable SSA components. In addition, Cohen's $d$ statistic is calculated for statistically significant observations to examine the effect size of the corresponding measure for two sample groups. Cohen's $d$ is particularly popular in meta-analysis, in which the difference between the means of two sample groups is considered important [53]. An effect size is considered medium if $0.5 \le d < 0.8$ and large if $d \ge 0.8$, as suggested in the available research [53, 54]. In this paper, the magnitude of $d$ is used to indicate the notable SSA components for pattern learning.
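The screening logic can be sketched as follows. This is an illustration under stated assumptions, not the procedure's actual code: the $p$ values are taken as given (computing them requires a $t$-test routine that is not shown), Cohen's $d$ is computed with a pooled standard deviation, which is one common convention that the text does not specify, the conventional thresholds of 0.5 and 0.8 are assumed, and all function names and numbers are hypothetical.

```python
import math
import statistics


def rank_and_screen(p_values, alpha=0.05):
    """Rank components by ascending p value and list those with p < alpha.

    Component numbers in the returned `notable` list are 1-based.
    """
    order = sorted(range(len(p_values)), key=lambda i: p_values[i])
    ranks = [0] * len(p_values)
    for rank, index in enumerate(order, start=1):
        ranks[index] = rank
    notable = [i + 1 for i, p in enumerate(p_values) if p < alpha]
    return ranks, notable


def cohens_d(group_a, group_b):
    """Cohen's d using a pooled standard deviation (assumed convention)."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * statistics.variance(group_a)
        + (n_b - 1) * statistics.variance(group_b)
    ) / (n_a + n_b - 2)
    return (statistics.mean(group_a) - statistics.mean(group_b)) / math.sqrt(pooled_var)


def effect_label(d):
    """Label the effect size using the conventional 0.5 and 0.8 thresholds."""
    magnitude = abs(d)
    if magnitude >= 0.8:
        return "large"
    if magnitude >= 0.5:
        return "medium"
    return "below medium"


# Hypothetical p values for five components (not results from this study)
ranks, notable = rank_and_screen([0.001, 0.20, 0.03, 0.0004, 0.06])
print(ranks)    # [2, 5, 3, 1, 4]
print(notable)  # [1, 3, 4]

# Hypothetical entropy values for two groups
d = cohens_d([1, 2, 3, 4, 5], [3, 4, 5, 6, 7])
print(round(d, 4), effect_label(d))  # -1.2649 large
```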

3.5. Pattern Classifier and Cross-Validation

Machine learning can learn from data to make data-driven decisions, and many powerful machine learning algorithms can serve as pattern classifiers. In this paper, we use two classical machine learning algorithms as pattern classifiers, i.e., SVM and linear discriminant analysis (LDA). The main working principle of SVM is to map the input space to a high-dimensional feature space and then use a kernel function to determine an optimal linear separating hyperplane in the feature space. LDA, based on the within-class and between-class scatter matrices, achieves a linear decision boundary by maximizing the between-class scatter and minimizing the within-class scatter. In our work, the SVM is implemented with the LIBSVM toolbox [55], and the LDA is implemented by the FITCDISCR function of Matlab. Table 3 presents the details of the parameters of the SVM and LDA pattern classifiers used in our work. For SVM, the type of SVM is selected as C-SVC ($s = 0$) and the radial basis function (RBF) is used as the kernel function ($t = 2$). We search the parameter space $2^{[-10:8]}$ with a step of one in the exponent to find the optimal values for the parameters c and g, and the other parameters use the default values recommended by the LIBSVM toolbox instructions [36, 56]. For LDA, the discriminant type of the FITCDISCR function is set to linear, which corresponds to regularized linear discriminant analysis.
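To make the size of the search space concrete, the candidate values for c and g can be enumerated as powers of two with exponents from −10 to 8. The sketch below only builds the grid; the variable names are hypothetical, and the step of scoring each pair with LIBSVM under cross-validation is not shown.

```python
from itertools import product

# Exponents -10, -9, ..., 8 with a step of one
exponents = range(-10, 9)
candidates = [2.0 ** e for e in exponents]

# Every (c, g) pair that a full grid search would evaluate
grid = list(product(candidates, repeat=2))

print(len(candidates))  # 19
print(len(grid))        # 361
print(grid[0])          # (0.0009765625, 0.0009765625)
print(grid[-1])         # (256.0, 256.0)
```

A full search therefore evaluates 361 parameter pairs per feature set, each of which would be scored by cross-validation on the training data.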

There are two types of cross-validation for machine learning, exhaustive and nonexhaustive, both of which aim to divide the whole set of samples into a training set and a test set. In this paper, leave-one-out cross-validation (LOOCV) and five-fold cross-validation (FFCV) were used as validation strategies to evaluate the performance of entropy-based pattern learning. LOOCV, an exhaustive cross-validation method, uses each sample as the test set exactly once, while the remaining samples are used as the training set. FFCV, a nonexhaustive cross-validation method, randomly partitions the whole set of samples into five subsets [57]. Each subset is used as the test set exactly once, while the remaining subsets are used as the training set. The performance evaluations of all tests are averaged to obtain the final statistical measures.
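The two validation strategies differ only in how sample indices are split, which the following sketch illustrates. The function names, the fixed random seed, and the way samples are dealt into folds are assumptions made for this example; the sample counts 200 and 83 correspond to the EEG and RR-interval sample totals given in Section 2.

```python
import random


def loocv_splits(n_samples):
    """Yield (train, test) index lists; each sample is the test set once."""
    for test_index in range(n_samples):
        train = [i for i in range(n_samples) if i != test_index]
        yield train, [test_index]


def kfold_splits(n_samples, k=5, seed=0):
    """Yield (train, test) index lists for a random k-fold partition."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[f::k] for f in range(k)]  # fold sizes differ by at most one
    for f in range(k):
        train = [i for g in range(k) if g != f for i in folds[g]]
        yield train, folds[f]


loocv = list(loocv_splits(200))
print(len(loocv), len(loocv[0][0]), len(loocv[0][1]))  # 200 199 1

five_fold = list(kfold_splits(83, k=5))
print([len(test) for _, test in five_fold])            # [17, 17, 17, 16, 16]
```

In both cases, screening of notable SSA components and classifier training would use only the training indices of each split, and the per-split results would then be averaged.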

3.6. Performance Evaluation

Accuracy, Sensitivity, Specificity, Positive Predictive Value (PPV), and Negative Predictive Value (NPV) are usually used as the statistical measures to evaluate the performance of pattern recognition [35, 38]. The mathematical expressions of these statistical measures are given by

$$
\begin{aligned}
\text{Accuracy} &= \frac{TP + TN}{TP + TN + FP + FN}, \\
\text{Sensitivity} &= \frac{TP}{TP + FN}, \\
\text{Specificity} &= \frac{TN}{TN + FP}, \\
\text{PPV} &= \frac{TP}{TP + FP}, \\
\text{NPV} &= \frac{TN}{TN + FN},
\end{aligned}
$$

where $TP$ is the number of true positive instances identified as positive instances, $TN$ is the number of true negative instances identified as negative instances, $FP$ is the number of true negative instances identified as positive instances, and $FN$ is the number of true positive instances identified as negative instances.
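These five measures follow directly from the four confusion counts, as the short sketch below shows. The function name and the counts in the example are hypothetical and are not results from this study.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute the five performance measures from confusion counts."""
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Hypothetical counts for 100 positive and 100 negative instances
metrics = classification_metrics(tp=90, tn=80, fp=20, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With these counts the accuracy is 0.85, the sensitivity 0.90, and the specificity 0.80, while the PPV (90/110) and NPV (80/90) depend on how the errors are distributed between the two classes.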

4. Results

4.1. Eye States Identification from EEG Signals

To demonstrate its validity, the proposed entropy-based pattern learning was used to identify the eye-closed and eye-open states from EEG signals. In this experiment, the EEG signals were first represented as ten SSA components by the use of SSA with window length $L = 10$. Then, four well-established entropy measures, ApEn, SampEn, MSEn, and FuzzyEn, were extracted from the ten SSA components as the features to quantify the EEG signals under the eye-closed and eye-open physiological states. The calculation parameters for the four well-established entropy measures were set as shown in Table 2.

Figure 2 shows the statistical results of the values of ApEn, SampEn, MSEn, and FuzzyEn for the first to tenth SSA components of EEG signals under the eye-closed and eye-open physiological states, which are given as means ± standard errors. The order of SSA components from 1st to 10th in Figure 2 denotes the first to tenth SSA components of EEG signals, which corresponds to the descending order of magnitude of the ten eigenvalues obtained from the SSA of the EEG signals. For each of the SSA components of EEG signals, the paired-sample $t$-test is used to test the significance of the difference in the entropy measures between the eye-closed state and the eye-open state. The $p$ values of the paired-sample $t$-test for ApEn, SampEn, MSEn, and FuzzyEn of the first to tenth SSA components between the eye-closed state and eye-open state are shown in Table 4.

In Table 4, the smaller the $p$ values are, the more significant are the differences in the entropy measures of the SSA components between the eye-closed state and eye-open state. According to the ascending order of the $p$ values of the paired-sample $t$-test, for ApEn and SampEn, the first to tenth SSA components of EEG signals can be ranked as 1, 5, 10, 9, 8, 7, 6, 3, 4, and 2, respectively. In the same way, for FuzzyEn the first to tenth SSA components of EEG signals can be ranked 1, 8, 3, 5, 9, 10, 4, 7, 6, and 2, respectively; for MSEn, the first to tenth SSA components of EEG signals can be ranked 3, 2, 6, 4, 9, 7, 8, 1, 10, and 5, respectively. In addition, for most SSA components of ApEn, SampEn, MSEn, and FuzzyEn, the $p$ values are less than 0.05. The exceptions are ApEn of the 3rd SSA component, SampEn of the 3rd and 4th SSA components (the latter with $p = 0.0531$), FuzzyEn of the 5th and 6th SSA components (the latter with $p = 0.767$), and MSEn of the 9th SSA component.

Next, we used SVM as the pattern classifier and LOOCV as the cross-validation strategy to evaluate the performance of the proposed entropy-based pattern learning in identifying the eye-closed and eye-open physiological states from EEG signals. First, the notable SSA components need to be screened in the training phase. SSA components whose $p$ values are less than 0.05 are recognized as the notable SSA components. Therefore, taking advantage of the feature dimension reduction by the paired-sample $t$-test, only the entropy measures of the SSA components having significant differences ($p < 0.05$) between the eye-closed state and eye-open state were selected as the input features for the pattern classifier. Table 5 presents the performance evaluations for the proposed entropy-based pattern learning to identify the eye-closed and eye-open physiological states from EEG signals. The eye-closed state is specified as the positive instance and the eye-open state is specified as the negative instance.

In Table 5, since the ten SSA components are obtained from EEG signals by the use of SSA with window length $L = 10$, the dimensions of ApEn, SampEn, FuzzyEn, and MSEn are all 10. However, according to the experimental results of the paired-sample $t$-test in the training phase, for the ten SSA components, the dimension of the input features of ApEn for the pattern classifier is just 9. Similarly, the dimensions of the input features of SampEn, FuzzyEn, and MSEn for the pattern classifier are 8, 8, and 9, respectively. As shown in Table 5, with respect to the four well-established entropy measures, the proposed entropy-based pattern learning can achieve excellent performance in identifying the eye-closed and eye-open physiological states from EEG signals. Specifically, for the proposed entropy-based pattern learning based on ApEn, SampEn, FuzzyEn, and MSEn, the identification accuracies of the eye-closed and eye-open physiological states are 95.50%, 94.50%, 96.50%, and 92.00%, respectively. It should be noted that the optimum accuracy of 96.50% is achieved based on FuzzyEn.

4.2. Physical Actions Classification from EMG Signals

To show its applicability, the proposed entropy-based pattern learning was used to classify the normal and aggressive physical actions from EMG signals. Moreover, in order to illustrate the superiority of the proposed entropy-based pattern learning, we made a comparative study with the existing research results reported by Pham [14]. For the EMG signals in our experiment, the first 5000 data points of the first channel (corresponding to the right bicep EMG signals) were used to ensure fair and valid comparisons, which were the same EMG signals used in Pham's study [14]. First, the EMG signals were represented as twenty SSA components by the use of SSA with window length $L = 20$. Then, ApEn, SampEn, MSEn, and FuzzyEn were extracted from the twenty SSA components as the features to quantify the EMG signals under normal and aggressive physical actions. The calculation parameters for ApEn, SampEn, MSEn, and FuzzyEn are also shown in Table 2.

Figure 3 shows the values of ApEn, SampEn, MSEn, and FuzzyEn for the first to twentieth SSA components of EMG signals under aggressive and normal physical actions, given as means ± standard errors. The SSA components in Figure 3 are ordered from 1st to 20th according to the descending magnitude of the twenty eigenvalues obtained from the SSA of the EMG signals. For each SSA component, the paired-sample t-test was used to test the significance of the difference in the entropy measures between the aggressive and normal physical actions. The p values of the paired-sample t-test for ApEn, SampEn, MSEn, and FuzzyEn of the first to twentieth SSA components are shown in Table 6.

In Table 6, the smaller the p value is, the more significant is the difference in the entropy measure of the SSA component between the aggressive and normal physical actions. According to the ascending order of the p values of the paired-sample t-test, for ApEn, the first to twentieth SSA components of EMG signals are ranked as 4, 3, 5, 7, 10, 1, 2, 6, 20, 11, 12, 14, 18, 16, 13, 17, 19, 8, 15, and 9, respectively. The p values of most SSA components of ApEn are greater than 0.05, except for the 1st, 2nd, 3rd, 4th, 6th, 7th, 8th, and 18th SSA components. Similarly, for SampEn, the first to twentieth SSA components are ranked as 6, 3, 9, 11, 16, 1, 2, 7, 17, 18, 20, 19, 10, 15, 13, 14, 12, 4, 8, and 5, respectively. The p values of most SSA components of SampEn are also greater than 0.05, except for the 1st, 2nd, 3rd, 6th, 7th, 8th, 18th, 19th, and 20th SSA components. For FuzzyEn, the first to twentieth SSA components are ranked as 3, 5, 9, 6, 8, 1, 2, 4, 19, 7, 10, 12, 15, 16, 14, 17, 11, 20, 13, and 18, respectively. The p values of the twenty SSA components of FuzzyEn are all less than 0.05. For MSEn, the first to twentieth SSA components are ranked as 10, 4, 3, 6, 9, 13, 20, 16, 18, 19, 17, 15, 14, 8, 11, 12, 5, 1, 2, and 7, respectively. The p values of most SSA components of MSEn are greater than 0.05, except for the 2nd, 3rd, 4th, 17th, 18th, 19th, and 20th SSA components.
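The ranking and screening logic above can be illustrated with a short sketch. It is not the authors' code: the data are made up, the helper names are hypothetical, and instead of computing p values it works with the paired t statistic directly. This relies on the fact that, for a fixed number of pairs, a larger |t| always corresponds to a smaller two-sided p value, so p < 0.05 is equivalent to |t| exceeding the tabulated critical value (2.776 for 4 degrees of freedom).

```python
import math
from statistics import mean, stdev

def paired_t(a, b):
    """Paired-sample t statistic for two equal-length sequences."""
    diffs = [x - y for x, y in zip(a, b)]
    return mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))

def rank_components(class_a, class_b):
    """Rank components from most (rank 1) to least significant.

    class_a[k] and class_b[k] hold the paired entropy values of component k.
    Sorting by descending |t| matches sorting by ascending p value because
    every component has the same degrees of freedom.
    """
    t_abs = [abs(paired_t(a, b)) for a, b in zip(class_a, class_b)]
    order = sorted(range(len(t_abs)), key=lambda k: -t_abs[k])
    ranks = [0] * len(t_abs)
    for rank, k in enumerate(order, start=1):
        ranks[k] = rank
    return ranks, t_abs

# Made-up entropy values: 3 components, 5 paired recordings per class.
aggressive = [[1.0, 1.2, 0.9, 1.1, 1.3],
              [1.0, 0.8, 1.1, 0.9, 1.0],
              [2.0, 2.1, 1.9, 2.2, 2.0]]
normal = [[0.5, 0.6, 0.4, 0.7, 0.8],
          [0.9, 1.0, 0.8, 1.0, 0.9],
          [1.8, 1.8, 1.8, 1.9, 1.9]]

ranks, t_abs = rank_components(aggressive, normal)
T_CRIT = 2.776   # two-sided 5% critical value of Student's t, 4 degrees of freedom
kept = [k for k, t in enumerate(t_abs) if t > T_CRIT]   # zero-based indices
```

In this toy example the second component does not pass the threshold, so only two of the three entropy values would be passed to the classifier.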

Next, we used SVM and LDA, respectively, as pattern classifiers and LOOCV as the cross-validation strategy to evaluate the accuracy of identifying aggressive and normal physical actions from EMG signals. LDA and LOOCV were used because they are the pattern classifier and cross-validation strategy used in Pham’s study [14], which ensures a fair comparison. As with the entropy-based pattern learning for EEG signals, we used the paired-sample t-test for feature dimension reduction in the training phase. Table 7 presents the performance of the proposed entropy-based pattern learning in identifying the aggressive and normal physical actions from EMG signals.
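The evaluation protocol above, with feature screening confined to the training phase, can be sketched as a leave-one-out loop in which the screening is repeated inside every fold. The sketch below is only illustrative: the nearest-centroid classifier and the mean-gap screening rule are simple stand-ins chosen to keep the example dependency-free, not the SVM, LDA, or t-test used in the paper, and all names and data are hypothetical.

```python
from statistics import mean

def loocv_accuracy(X, y, select, fit, predict):
    """Leave-one-out accuracy with feature screening redone in every fold."""
    hits = 0
    for i in range(len(X)):
        X_tr = X[:i] + X[i + 1:]
        y_tr = y[:i] + y[i + 1:]
        cols = select(X_tr, y_tr)              # screening sees training rows only
        model = fit([[row[c] for c in cols] for row in X_tr], y_tr)
        hits += predict(model, [X[i][c] for c in cols]) == y[i]
    return hits / len(X)

# Stand-in screening rule (assumption): keep features whose class means
# differ by more than min_gap; fall back to all features if none qualify.
def select_by_mean_gap(X, y, min_gap=0.5):
    labels = sorted(set(y))
    keep = []
    for c in range(len(X[0])):
        m = [mean(r[c] for r, t in zip(X, y) if t == lab) for lab in labels]
        if abs(m[0] - m[1]) > min_gap:
            keep.append(c)
    return keep or list(range(len(X[0])))

# Stand-in classifier (assumption): nearest class centroid.
def fit_centroids(X, y):
    return {lab: [mean(col) for col in zip(*[r for r, t in zip(X, y) if t == lab])]
            for lab in set(y)}

def predict_centroid(model, x):
    return min(model, key=lambda lab: sum((a - b) ** 2 for a, b in zip(x, model[lab])))

# Made-up data: feature 0 separates the classes, feature 1 is noise.
X = [[0.1, 1.0], [0.2, 0.2], [0.0, 0.9], [0.3, 0.1],
     [1.1, 0.8], [1.2, 0.3], [0.9, 1.1], [1.0, 0.0]]
y = [0, 0, 0, 0, 1, 1, 1, 1]
acc = loocv_accuracy(X, y, select_by_mean_gap, fit_centroids, predict_centroid)
```

Repeating the screening inside each fold keeps the held-out sample from influencing which components are selected, which is the point of restricting the t-test to the training phase.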

In Table 7, because twenty SSA components are obtained from the EMG signals by SSA with window length w = 20, the dimensions of ApEn, SampEn, FuzzyEn, and MSEn are all 20. According to the results of the paired-sample t-test in the training phase, for the first to twentieth SSA components of EMG signals, the differences in FuzzyEn between the aggressive and normal physical actions were significant for all SSA components (p < 0.05). Therefore, the dimension of the input features of FuzzyEn for the pattern classifier is 20. However, for some SSA components, ApEn, SampEn, and MSEn do not differ significantly (p > 0.05) between the aggressive and normal physical actions. As a result, the dimensions of the input features of ApEn, SampEn, and MSEn for the proposed entropy-based pattern learning are just 8, 9, and 7, respectively.

As shown in Table 7, for the proposed entropy-based pattern learning based on ApEn, SampEn, FuzzyEn, and MSEn, the identification accuracies of the aggressive and normal physical actions are 77.50%, 80.00%, 87.50%, and 80.00%, respectively, when SVM is used as the pattern classifier, and 72.50%, 75.00%, 68.78%, and 73.75%, respectively, when LDA is used. In Pham’s study [14], the identification accuracies were 61.25% and 50.00%, corresponding to LDA with time-shift multiscale entropy (TSME) as features and LDA with MSEn as features, respectively. Hence, according to the results in Table 7, our method outperforms that reported in Pham’s study. In addition, these contrasting results demonstrate that the proposed entropy-based pattern learning has strong discriminating power to reveal specific biosignal patterns in EMG signals.

4.3. Heart States Detection from RR-Intervals Signals

To demonstrate the wide application potential of the proposed entropy-based pattern learning for the assessment of physiological signals, we further used it to identify the NSR-heart and CHF-heart states from RR-interval signals. Moreover, we also made a comparison with the recent research reported by Liu et al. [57]. In our experiment, the RR-interval signals were first represented as twelve SSA components by SSA with window length w = 12. Then, ApEn, SampEn, MSEn, and FuzzyEn were extracted from the twelve SSA components to quantify the RR-interval signals under the two heart states (NSR-heart state and CHF-heart state). The calculation parameters for ApEn, SampEn, MSEn, and FuzzyEn are listed in Table 2. It should be noted that the first hour of the RR-interval signals was used in our experiment. Because the heart rhythm differs from subject to subject, the number of data points in the first hour was not fixed. The means and standard deviations of the length of data points for the NSR-heart and CHF-heart states were and 5266 ± 623, respectively.

Figure 4 shows the values of ApEn, SampEn, MSEn, and FuzzyEn for the first to twelfth SSA components of RR-interval signals under the NSR-heart and CHF-heart states, given as means ± standard errors. The SSA components in Figure 4 are ordered from 1st to 12th according to the descending magnitude of the twelve eigenvalues obtained from the SSA of the RR-interval signals. For each SSA component, the effect size Cohen’s d of the entropy measures between the NSR-heart and CHF-heart states was calculated so as to rank the SSA components. The values of Cohen’s d for ApEn, SampEn, MSEn, and FuzzyEn of the first to twelfth SSA components of RR-interval signals between the NSR-heart and CHF-heart states are shown in Table 8.

In Table 8, the larger the value of Cohen’s d is, the larger is the difference in the entropy measure of the SSA component between the NSR-heart and CHF-heart states. According to the descending order of the values of Cohen’s d, for ApEn, the first to twelfth SSA components of RR-interval signals are ranked as 10, 6, 9, 8, 7, 11, 5, 12, 3, 4, 1, and 2, respectively. The values of Cohen’s d for the 9th, 10th, 11th, and 12th SSA components are larger than 0.40. In the same way, for SampEn, the first to twelfth SSA components are ranked as 12, 5, 10, 11, 7, 8, 6, 9, 1, 4, 2, and 3, respectively. The values of Cohen’s d for the 2nd, 5th, 7th, and 9th to 12th SSA components are larger than 0.40. For FuzzyEn, the first to twelfth SSA components are ranked as 4, 1, 3, 12, 7, 8, 9, 11, 2, 5, 6, and 10, respectively. The values of Cohen’s d for the 1st, 2nd, 3rd, and 9th SSA components are larger than 0.40. For MSEn, the first to twelfth SSA components are ranked as 12, 6, 10, 11, 3, 9, 8, 4, 1, 7, 5, and 2, respectively. The values of Cohen’s d for the 2nd, 5th, and 7th to 12th SSA components are larger than 0.40.

Next, we used SVM as the pattern classifier and FFCV as the cross-validation strategy to evaluate the performance of the proposed entropy-based pattern learning in identifying the NSR-heart and CHF-heart states from RR-interval signals. As before, the notable SSA components were first screened in the training phase. The effect size Cohen’s d was used to reduce the feature dimension, with a magnitude of d > 0.40 indicating a notable SSA component. Table 9 shows the performance of the proposed entropy-based pattern learning in identifying the NSR-heart and CHF-heart states from RR-interval signals.
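The effect-size screening can be sketched in a few lines. The pooled-standard-deviation form of Cohen’s d used here is the common textbook definition and is an assumption, since the exact variant is not restated in this section; the function names and the toy values are hypothetical, and the 0.40 threshold is the one quoted for Table 8.

```python
import math
from statistics import mean, variance

def cohens_d(a, b):
    """Cohen's d with a pooled standard deviation (assumed definition)."""
    na, nb = len(a), len(b)
    pooled = math.sqrt(((na - 1) * variance(a) + (nb - 1) * variance(b))
                       / (na + nb - 2))
    return (mean(a) - mean(b)) / pooled

def notable_components(class_a, class_b, threshold=0.40):
    """1-based indices of the components whose |d| exceeds the threshold."""
    return [k + 1 for k, (a, b) in enumerate(zip(class_a, class_b))
            if abs(cohens_d(a, b)) > threshold]

# Made-up entropy values for two components and two groups of subjects.
nsr = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0]]
chf = [[1.0, 2.0, 3.0], [1.0, 2.0, 3.0]]
selected = notable_components(nsr, chf)
```

Unlike the paired t-test, this measure does not require the two groups to have equal size, which suits unpaired NSR and CHF subject groups.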

In Table 9, since twelve SSA components are obtained from the RR-interval signals by SSA with window length w = 12, the dimensions of ApEn, SampEn, FuzzyEn, and MSEn are all 12. Nevertheless, according to the values of Cohen’s d in the training phase, the dimensions of the input features of ApEn, SampEn, FuzzyEn, and MSEn for the pattern classifier are 4, 7, 4, and 8, respectively. As shown in Table 9, for the proposed entropy-based pattern learning based on ApEn, SampEn, FuzzyEn, and MSEn, the identification accuracies of the NSR-heart and CHF-heart states are 73.49%, 79.52%, 86.75%, and 81.93%, respectively. The best identification accuracy of 86.75% is achieved with FuzzyEn. In Liu’s study, for the traditional MSEn analysis of the original RR-intervals signals (named as MSE_RR), the identification accuracy of five-fold cross-validation was 74.6% when the length of data points was ; for the MSEn analysis of the differential RR interval signals (named as MSE_dRR), the identification accuracy of five-fold cross-validation was 85.6% when the length of data points was . Hence, the identification accuracy of the proposed entropy-based pattern learning based on MSEn is 7.33 percentage points higher than that of MSE_RR in Liu’s study, and the identification accuracy based on FuzzyEn is higher than those of both MSE_RR and MSE_dRR in Liu’s study.
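The comparison with Liu’s study is a matter of simple differences between accuracies, which can be checked as follows. The numbers are those quoted in the text; the dictionary and function names are only illustrative.

```python
# Accuracies (%) quoted in the text for the RR-interval task.
ours = {"ApEn": 73.49, "SampEn": 79.52, "FuzzyEn": 86.75, "MSEn": 81.93}
liu = {"MSE_RR": 74.6, "MSE_dRR": 85.6}

def gap(a, b):
    """Difference between two accuracies, in percentage points."""
    return round(a - b, 2)

msen_vs_mse_rr = gap(ours["MSEn"], liu["MSE_RR"])
fuzzy_vs_mse_rr = gap(ours["FuzzyEn"], liu["MSE_RR"])
fuzzy_vs_mse_drr = gap(ours["FuzzyEn"], liu["MSE_dRR"])
```

The differences are absolute gaps in percentage points rather than relative improvements: 7.33 for MSEn against MSE_RR, and 12.15 and 1.15 for FuzzyEn against MSE_RR and MSE_dRR, respectively.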

5. Discussion

In this paper, we make an innovative attempt to develop a new methodology of entropy-based pattern learning based on SSA components and entropy measures for the assessment of physiological signals. In the proposed entropy-based pattern learning, a new feature engineering approach that uses SSA to decompose physiological signals is applied to the extraction of entropy measures. Physiological signals are first represented as a series of SSA components, and entropy measures are then extracted from the resulting SSA components. The SSA components are data adaptive and closely related to the temporal structure of the physiological signals themselves [46]. The main benefit of the proposed entropy-based pattern learning is that the SSA components facilitate the extraction of entropy measures from physiological signals. In our work, four well-established entropy measures, ApEn, SampEn, FuzzyEn, and MSEn, are calculated from the SSA components as the features of physiological signals. Our experiments demonstrate that the proposed entropy-based pattern learning is capable of revealing specific biosignal patterns in EEG, EMG, and RR-interval signals. It is worth asking why representing physiological signals by SSA components enhances the recognition of biosignal patterns.

A possible explanation is that, from the perspective of signal decomposition, more informative features can be extracted from the SSA components than from the physiological signals themselves, which improves the discrimination of biosignal patterns. To demonstrate the difference in the discriminating power of entropy measures between the SSA components and the physiological signals themselves, we further applied the entropy-based pattern learning to the physiological signals themselves, without the SSA representation, to identify specific patterns in EEG, EMG, and RR-interval signals, and compared the results with those obtained with the SSA representation. In this experiment, we concentrated on FuzzyEn as the entropy measure. Table 10 compares the identification accuracies of the entropy-based pattern learning in EEG, EMG, and RR-interval signals with and without the SSA representation. Pattern learning tasks #1, #2, and #3 refer to the eye-closed and eye-open states identification from EEG signals, the aggressive and normal physical actions classification from EMG signals, and the NSR-heart and CHF-heart states detection from RR-interval signals, respectively, which are the pattern learning tasks of the previous section. For all three pattern learning tasks, SVM is used as the pattern classifier and LOOCV as the cross-validation strategy.

According to the experimental results of the previous section, for pattern learning tasks #1 and #2, the best identification accuracies of the proposed entropy-based pattern learning based on FuzzyEn are 96.50% and 87.50%, respectively. For pattern learning task #3, in order to make a better comparison, we used LOOCV (an exhaustive cross-validation) as the validation strategy instead of FFCV. With LOOCV, the best identification accuracy of the proposed entropy-based pattern learning for task #3 is 83.13%. However, as shown in Table 10, when the EEG, EMG, and RR-interval signals are used without the SSA representation, the identification accuracies for pattern learning tasks #1, #2, and #3 are 71.50%, 80.00%, and 65.06%, respectively, which are clearly lower than the results obtained with the SSA components. Therefore, the comparison shows that the discriminating power of entropy measures based on the SSA components is much stronger than that of entropy measures based on the physiological signals themselves. It should be noted that for the entropy-based pattern learning on the physiological signals themselves, the dimension of the entropy measures in Table 10 is 1, because only the physiological signal itself is used to calculate the entropy measure.

When physiological signals are represented as SSA components, the window length of SSA is critical to the performance of the proposed entropy-based pattern learning [58]. To evaluate the influence of the window length, we applied the proposed entropy-based pattern learning with different window lengths to the eye-closed and eye-open states identification from EEG signals and to the aggressive and normal physical actions classification from EMG signals. Table 11 compares the identification accuracies obtained with the different window lengths for both tasks.

As shown in Table 11, for the identification of eye-closed and eye-open states from EEG signals, when the window length is selected as 6, 10, 14, 16, 20, and 30, the identification accuracies of the proposed entropy-based pattern learning are 90.00%, 96.50%, 93.50%, 88.00%, 88.00%, and 90.00%, respectively. The best accuracy of 96.50% is achieved with window length w = 10. For the identification of the aggressive and normal physical actions from EMG signals, when the window length is selected as 5, 10, 15, 20, 25, and 30, the identification accuracies are 72.50%, 81.25%, 75.00%, 87.50%, 70.00%, and 75.00%, respectively. The best accuracy of 87.50% is achieved with window length w = 20. According to these results, to achieve the optimum performance, the window length should be large enough to obtain the essential components of the physiological signals. However, in Table 11 the largest window length does not yield the best identification accuracy for either signal. Therefore, the window length of SSA should not be chosen too large, as this has an adverse effect on the performance of the proposed entropy-based pattern learning. This may be because an overly large window length can cause some essential components to mix with each other [46, 58].

Finally, as shown in Table 11, the optimum window length for the proposed entropy-based pattern learning may vary with the type of physiological signal. For the EMG signals, the best choice of window length is w = 20, whereas for the EEG signals it is w = 10. The main reason is probably that, in SSA decomposition, the appropriate window length depends on the problem at hand [46, 58]. The SSA method is more an exploratory, model-building tool than a confirmatory procedure [46, 58]. In practical applications, how to determine the window length of SSA is an important issue worthy of in-depth study. To the best of our knowledge, no universal rules or unambiguous recommendations can be given for the choice of the window length in the general case. In our work, for the specific pattern learning tasks, the window length of SSA is closely related to the dimension of the input features of the pattern classifier (the number of SSA components). The window length w is therefore regarded as a hyperparameter of the machine learning method (pattern classifier) and is determined by grid search, which is a common practice for manual hyperparameter search [59].
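Treating the window length as a hyperparameter chosen by grid search amounts to evaluating each candidate and keeping the best one. The sketch below uses the accuracies quoted from Table 11 as a lookup in place of a real evaluation; in practice `evaluate(w)` would run the whole pipeline (SSA, entropy extraction, screening, and cross-validated classification) for that window length. The function and variable names are hypothetical.

```python
# Accuracies (%) quoted from Table 11, keyed by window length w.
eeg_accuracy = {6: 90.00, 10: 96.50, 14: 93.50, 16: 88.00, 20: 88.00, 30: 90.00}
emg_accuracy = {5: 72.50, 10: 81.25, 15: 75.00, 20: 87.50, 25: 70.00, 30: 75.00}

def grid_search(candidates, evaluate):
    """Return the candidate with the highest score, plus all scores."""
    scores = {w: evaluate(w) for w in candidates}
    best = max(scores, key=scores.get)
    return best, scores

# The lookups stand in for a full cross-validated evaluation.
best_eeg, _ = grid_search(eeg_accuracy, eeg_accuracy.get)
best_emg, _ = grid_search(emg_accuracy, emg_accuracy.get)
```

With the quoted accuracies, the search selects w = 10 for the EEG task and w = 20 for the EMG task, and in neither case is the largest candidate the winner.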

There are some advantages and disadvantages in our current study. The main advantages of the proposed entropy-based pattern learning are as follows. Firstly, employing the SSA components to facilitate the extraction of entropy measures from physiological signals has strong power to uncover biosignal patterns. The SSA components are data adaptive and can be viewed as reflecting the temporal structure of physiological signals [46]. Secondly, the proposed entropy-based pattern learning is versatile and flexible: it can be combined with different machine learning algorithms and different entropy measures, such as bubble entropy [60], permutation entropy [61], and short-term Shannon entropy and Rényi entropy [21, 62, 63], to identify specific biosignal patterns in EEG, EMG, and other physiological signals. Thirdly, the proposed entropy-based pattern learning is well suited to the dynamic analysis of multicomponent, nonstationary, and nonlinear physiological signals, because the SSA method can be applied to arbitrary signals, including nonstationary signals [45], and is especially useful for analyzing and forecasting physiological signals with complex nonstationarity. However, there are also some disadvantages. For one thing, to represent the physiological signals with SSA components, the window length of SSA should be carefully determined. As already mentioned, there is no established method to determine the window length automatically. For another, to achieve the optimum performance, screening the notable components is an indispensable step for selecting the appropriate entropy measures from the SSA components, so the dimension of the entropy measures should be carefully determined. Finally, the entropy measures are computationally intensive, especially when the physiological signals are represented as SSA components: the computational cost of the proposed entropy-based pattern learning is proportional to the number of SSA components.
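To make the cost argument concrete, a direct implementation of sample entropy is sketched below. It follows the usual definition (template matching under the Chebyshev distance, self-matches excluded); the defaults m = 2 and r = 0.2 times the standard deviation are common choices and an assumption here, since the parameters of Table 2 are not restated in this section, and the function name is hypothetical.

```python
import math
import random
from statistics import pstdev

def sample_entropy(x, m=2, r_factor=0.2):
    """SampEn(m, r) with r = r_factor * SD(x); self-matches are excluded."""
    n = len(x)
    r = r_factor * pstdev(x)

    def count_matches(length):
        # The same n - m starting points are used for both template lengths.
        templates = [x[i:i + length] for i in range(n - m)]
        count = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):
                dist = max(abs(a - b) for a, b in zip(templates[i], templates[j]))
                if dist <= r:
                    count += 1
        return count

    b = count_matches(m)        # matches of length m
    a = count_matches(m + 1)    # matches of length m + 1
    if a == 0 or b == 0:
        return float("inf")     # undefined when no matches are found
    return -math.log(a / b)

# A perfectly regular signal versus seeded pseudo-random noise.
periodic = [0.0, 1.0] * 50
random.seed(0)
noise = [random.random() for _ in range(100)]
se_periodic = sample_entropy(periodic)
se_noise = sample_entropy(noise)
```

The double loop compares every pair of templates, so one entropy value costs on the order of N² comparisons for a signal of N points, and the whole feature vector costs that amount once per SSA component.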

6. Conclusions

In this paper, an innovative entropy-based pattern learning that employs SSA components to extract entropy measures has been put forward for the assessment of physiological signals. Its main innovation is a new feature engineering approach that combines SSA with entropy measures to facilitate the extraction of entropy measures from physiological signals, which differs from the available approaches of entropy-based pattern learning for physiological signals. To demonstrate the validity, applicability, and versatility of the proposed entropy-based pattern learning, three kinds of physiological signals, EEG, EMG, and RR-interval signals, are used for experimental evaluation. Two classical machine learning algorithms, SVM and LDA, are used as pattern classifiers to identify the eye-closed and eye-open states from EEG signals, to classify the aggressive and normal physical actions from EMG signals, and to detect the NSR-heart and CHF-heart states from RR-interval signals. The experimental results show that the proposed entropy-based pattern learning achieves excellent performance. The best identification accuracies are 96.50%, 87.50%, and 86.75% for the eye-closed and eye-open states identification from EEG signals, the aggressive and normal physical actions classification from EMG signals, and the NSR-heart and CHF-heart states detection from RR-interval signals, respectively. Comparisons with recently reported studies demonstrate that the proposed entropy-based pattern learning can achieve better performance for the medical assessment of physiological signals. In addition, according to the comparison of experimental results based on the physiological signals themselves and on the SSA components, the identification accuracies based on the SSA components are much higher than those based on the physiological signals themselves. Despite some drawbacks, such as the need to determine the window length of SSA carefully, we believe that the proposed entropy-based pattern learning is a promising avenue for the analysis and forecasting of physiological signals, neurological disease diagnosis, artificial intelligence, decision-making systems, and so on.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that there is no conflict of interest regarding the publication of this paper.

Acknowledgments

This work was supported in part by the Shenzhen Governmental Basic Research Grants (JCYJ20170412151226061 and JCYJ20180507182241622).