Abstract

Nonintrusive load identification for industrial users can accurately acquire the operation of each load. However, it is a major challenge in the demand-side response due to the hardship of collection data for modelling, and high precision measuring equipment is required. Aiming at this situation, a nonintrusive load identification method is proposed, combining the least square QR (LSQR) with the sequential leader clustering algorithm. Firstly, regarding accurate depiction of industrial loads, some appropriate load feature indices of steady-state and transient processes are extracted, respectively. For steady-state processes, the active power, the reactive power, and the root mean square (RMS) current value are selected as the feature indices. In the case of transient processes, ten feature indices of three stages are employed: before, during, and after transient events, consisting of the duration of transient events, the RMS current value before and after transient events, the average value of active power before and after transient events, the average value of reactive power before and after transient events, the maximum RMS current value of transient events, etc. On this base, the LSQR algorithm is proposed to decompose unknown composite power to access the operation of various loads at steady-state. The sequential leader clustering algorithm is propounded to classify transient events of typical industrial loads and further identify which kind of loads had switched. Finally, to validate the effectiveness of the presented model, data of industrial loads from a concrete plant are collected, including blender, cement screw, sewage dump, and inclined belt conveyor, and simulation analysis is fulfilled. The results indicate that the model proposed can effectively achieve the nonintrusive industrial load identification, and least unified residue (LUR) is about 10−16, which is much better than the factorial hidden Markov model (FHMM) and the artificial neural network (ANN) model.

1. Introduction

With the conception of electric-load intelligent (E-LI) being put forward, the number of users increases so rapidly that load types are diversified and users’ requirements for electricity service are more and more detailed [1, 2]. On this basis, it is crucial to analyze the electric consumption behavior more precisely and study the key technologies related to load monitoring, e.g., load feature extraction [3, 4], load identification [5], load forecasting [6], and load monitoring [7]. Therefore, research on these key technologies can improve the efficiency of collection-transmission and enhance the accuracy of monitoring and identification for load equipment. Currently, there are mainly two load monitoring approaches: intrusive load monitoring (ILM) and nonintrusive load monitoring (NILM). ILM is built upon low-end meter devices attached to home appliances in opposition to NILM techniques, and a single sensing point of each electrical device is needed in the ILM [8, 9]. Although ILM can provide higher efficiency and reliability than NILM, it is gradually replaced by the latter. NILM can perform real-time monitoring and analysis of demand-side load. It has many advantages, including low cost, simple communication, easy maintenance, and promotion. Therefore, load identification methods for NILM become a research hotspot of demand-side load monitoring.

Presently, some research results probe into deep learning (DL) intelligent algorithms for NILM in the steady-state process, such as hidden Markov model [10, 11], artificial neural network (ANN) model [12, 13], and graph signal processing developed approach [14, 15]. In [10, 11], the factorial hidden Markov model (FHMM) is considered, and it is indicated to outperform the other unsupervised disaggregation methods. In [12, 13], the DL model on ANN algorithm is also investigated and has high effectiveness and accuracy in analysis of power consumption behavior for load monitoring. In [14, 15], the graph signal processing developed approach is designed for NILM. The competitive performance of the GSP-based approach is demonstrated with respect to the traditional hidden Markov model and decision tree approaches. Nevertheless, the above research outputs are mainly focused on residential electricity consumption, and there are few achievements on industrial electricity consumption.

Moreover, several methods concentrate on the transient process with a view to further monitor the load switching status, including data acquisition, transient feature extraction, and load identification. Some research evaluates load signatures established by voltage-current (V-I) trajectory [16, 17], and some papers propose signal processing techniques to extract features, such as wavelet transform [18] and Stockwell transform (S-transform) [19, 20]. Some novel algorithms are applied to transient energy analysis to improve recognition accuracy and computational speed. For instance, a stealthy black-box attack construction model is suggested in [21], and ANN model combining with turn-on transient energy analysis is utilized in [22]. Most of load disaggregation methods are designed to focus on the load on/off events of single appliance; however, a current investigation in [23] can effectively deal with the simultaneous on/off events of multiple appliances by applying a cepstrum-smoothing-based load disaggregation method.

The aforesaid research results are mainly aimed at household loads, and the experimental data mostly use open REDD data sets. There are few studies on the load decomposition and load identification for industrial users, and testing data of industrial loads are limited. In [24], it presents a new nonintrusive energy monitoring method for industrial microgrid; however, it is not suitable for demand-side monitoring for industrial users. Moreover, load feature extraction and load identification are only performed for steady-state or transient processes, and there is no overall process to cover steady-state and transient processes.

Given the above, a new method of load identification for industrial users is suggested, integrating with the analysis of steady-state and transient load feature of industrial users. The active power, the reactive power, and the root mean square (RMS) current value are selected as the feature indices of steady-state processes. The least square QR (LSQR) algorithm is utilized to decompose the loads in the steady-state process. In addition, for transient processes, ten feature indices of three stages are employed: before, during, and after transient events, consisting of the duration of transient events, the RMS current value before and after transient events, the average value of active power before and after transient events, the average value of reactive power before and after transient events, the maximum RMS current value of transient events, the maximum active power during transient, and the maximum reactive power during transient. The sequential leader clustering analysis is applied to classify the transient events of typical industrial loads and further identify transient events.

1.1. The Main Contributions

(i)According to the characteristics of industrial users, the steady-state and transient feature indices of industrial load are extracted, and the steady-state load feature space and transient load feature space are established.(ii)The load decomposition-identification model for industrial scenarios combining the LSQR with the sequential leader clustering algorithm is proposed. The operation of each load is obtained by using the LSQR algorithm model, in combination with the sequential leader clustering algorithm to further identify transient events.(iii)Four kinds of loads, including blender, cement screw, sewage pump, and inclined belt conveyor, are selected to test the load decomposition-identification model. The industrial load dataset has collected 30 hours of bus-side three-phase current and voltage signals from a concrete plant by DL850 oscillograph recorder.(iv)The results demonstrate that the proposed load decomposition-identification model gives better performance for the nonintrusive industrial load identification compared to the FHMM and ANN model.

The remainder of this paper is organized as follows. Section 2 provides a brief analysis of industrial loads and summarizes the feature indices of steady-state and transient processes. Section 3 describes the framework of the envisaged model, followed by the details of the three components: data processing of raw sampling, load decomposition of the steady-state process rooted on the LSQR algorithm, and the transient process load identification built on the clustering algorithm. Section 4 delineates the data sources and discusses the performance of the proposed model, and a validation of the whole approach is given. Section 5 concludes the paper and envisions the future work.

2. Characteristic Analysis of Industrial Users

The types of industrial users are so diverse that the traits of different types have certain distinct. In accordance with the electricity price classification, the industrial users can be categorized into following, namely, commercial mixed electricity, office-used, agricultural-used, civil-used, nongeneral industrial users, and large industrial users. The electricity demand of industrial users can be characterized as a sequence fluctuating with time. Modeling and analysis of this sequence can achieve its inherent traits [25, 26]. The equipment-side of a concrete plant is selected as the monitoring target in this paper. And, at 10 kHz sampling frequency, the DL850 oscillograph recorder is operated to record the current wave information of multiple typical loads on the equipment-side, including blender, cement screw, sewage pump, and inclined belt conveyor. For instance, Figure 1(a) shows the current change after a cement-screw-on event, while Figure 1(b) shows the current change after a cement-screw-off event.

From Figure 1, it shows that the load power consumption of industrial users generally has the following traits:(i)Current wave shows obvious fluctuation with time(ii)Load on/off events cause significant changes in instantaneous current(iii)Current waveform tends to be stable after load on/off events

In the action of load switching, the transient event of an industrial load consists of three stages: before, during, and after transient. The stage before transient refers to the original steady state before a load switches, and the stage after transient stands for another stable state after a load switches. And, some features, such as current and power, will change dramatically during the transient process. Therefore, it is necessary to extract complete features of steady-state and transient processes for load identification.

2.1. Steady-State Features Extraction

The active power and the reactive power are regularly opted as feature indices for the steady-state process:where P and Q symbolize the active power and the reactive power, respectively. V and I are the RMS values of voltage and current, respectively. ϕ means the phase difference between voltage and current, and cosϕ is defined as the power factor.

Figure 1 exhibits that the load signals of industrial users are periodic when the frequency of the modulation is a harmonic of the fundamental frequency. In order to characterize the industrial loads more comprehensively, the RMS value of current in the steady-state process is also extracted as a load feature index:where N is the number of sampling points in one fundamental period, n is the number of sampling points, and i(n) is the instantaneous current value of each sampling point.

Therefore, as shown in Table 1, the active power, the reactive power, and the RMS current value are extracted to construct the steady-state process feature space. RMS is a feature that is very sensitive to the window length; hence, here, we select the composite-window method based on the cumulative sum (CUSUM) algorithm to calculate three steady-state feature indices. The composite-window method is applicable for the feature indices of both steady-state and transient-state [26].

2.2. Transient Features Extraction

Detection of transient events is a central issue to realize load identification for the transient process. Waveforms between the original steady-state and the new steady-state can be captured to extract the transient features of different loads. In order to outline the transient process more completely, features of three stages delineated above are selected to construct a comprehensive feature space. As can be seen in Table 2, there are altogether 10 feature indices.

In the case of load-on events, the first stage in Table 2 represents the steady-state before the occurrence of load-on events. The second stage represents the process of load-on events. The third stage indicates the new steady-state after load-on events. It is similar in another case, except that event types are load-off events instead of load-on events.

In this paper, transient feature indices are extracted according to the composite-window based on CUSUM [26]. The window length is very important for transient events detection. If the window length is too short, the steady-state fluctuations and glitch of the current easily lead to false detection of the events. However, if the window length is too wide, transient events cannot be accurately identified. Due to the wide duration of industrial equipment start-up transient process, we select the maximum RMS current value during transient as one of the feature indices. This composite-window method can verify the validation of event detection and distinguish transient events due to quality problems, gaps, and impulses, etc., from the connection or disconnection of industrial loads.

The average value of active power before or after transient relies on the active power in each period, the same goes for the average value of reactive power. Take the pretransient process as an example:where T expresses the numbers of steady-state waveform periods before the transient process. Pn and Qn (n = 1, 2, …, T) are active power and reactive power in the n-th period, respectively.

3. Load Identification Method Depending on the Steady-State Process and the Transient-State Process

To obtain an integrated load identification for nonintrusive industrial users, this paper focuses on the load decomposition of the steady-state process combining with identification of the transient process. The designed identification algorithm mainly includes two parts, as can be explicated in Figure 2. Firstly, the steady-state load decomposition can be described by the least squares problem which can be optimized by the LSQR algorithm. Secondly, the sequential leader clustering algorithm is introduced to classify the corresponding load on/off events in the transient process, in such a way as to accurately identify which load or type of loads have a switch action.

3.1. Data Processing of Raw Sampling

Random errors are prone to occur when waveforms are recorded due to the complex industrial environment, regardless of the equipment-side or bus-side. Thus, invalid data need to be eliminated and corrected. In order to record the waveforms of collecting data more objectively and comprehensively, the data of voltage and current are usually sampled at high frequency. However, it will cause a sharp increase in data volume and decrease the data processing efficiency to a certain extent. As well, data of different load features are generally not in a unified dimension; accordingly, it makes normalization more essential. In short, it is necessary to adjust and process the original sampling data, including data cleaning, downsampling, and normalization. Herein, the Irada criterion is applied for data cleaning to eliminate the random error data, and the Nyquist sampling theorem is employed for downsampling to obtain the balance between efficiency and accuracy of sampling. The min-max normalization and Z-score normalization methods are both utilized to normalize parameters.

3.1.1. Irada Criterion

Assume that s1, s2, …, sn are original sampling data, respectively, and  = si-s (i = 1, 2, ..., n) symbolizes residual error, where s is the arithmetic mean. σ denotes the standard error in terms of Bessel’s formula. satisfies the following formula:where si is considered as a gross error value that should be corrected. Here, we use the average value of adjacent sampling points to replace the invalid data for correction.

3.1.2. Nyquist Sampling Theorem

Nyquist sampling theorem depicts the relationship between the sampling rate and the signal frequency. It is ruled that the sampling rate fs must be greater than twice the highest frequency of sampled signal, which is usually called the Nyquist frequency fN:

3.1.3. Min-Max Normalization

The min-max normalization can transform the original data into the range of [0 1] by the linearization method. Assume that sij is the value of the j-th feature of ith load. And, min(sj) represents the minimum value of the j-th feature among all the power consumption. Similarly, max(sj) indicates the maximum value of the j-th feature among all the power consumption.where Z(sij) denotes the normalized value of sij.

3.1.4. Z-Score Standard Normalization

Z-score standard normalization is usually used to standardize the original data, and it can eliminate the influence of different variable units on the results:where is the arithmetic mean of all power consumption features of the ith load, S(sij) is the sample standard deviation, and n is the number of the feature indices. Z(sij) has the same meaning as formula (6).

3.2. Load Decomposition of the Steady-State Process Relied on LSQR Algorithm
3.2.1. Problem Description

The LSQR algorithm solution consists of two steps [27, 28], as illustrated in Figure 3. The first is performing Golub–Kahan iterative bidiagonalization to simplify the calculation. The second is to solve the least-squares problem after bidiagonalization.

Given a known load feature database and a composite power feature database with unknown constituent elements in a certain scene, an unknown power can be decomposed into several identifiable loads which belong to the known load feature database. So, the load decomposition of the steady-state process for NILM can be described aswhere means the load feature matrix of R kinds of industrial users and expresses the load feature vector of the ith load with M feature factors. is the feature vector for a composite power expressed by the same M features. represents the weight vector of the loads, where xi designates the operation status of ith load. is the minimum second-order residual norm, which is the fitness function between the estimated composite power and the actual data.

In general, the feature matrix is nonpositive definite. Therefore, the LSQR algorithm is adopted to approximately solve the linear least squares problem as formula (8), which is equivalent to iteratively solving the regular equation .

3.2.2. Golub–Kahan Bidiagonalization

By singular value decomposition (SVD) method, the feature matrix H can be constructed as a bidiagonal form:where and are both unitary matrices, which can be determined by the iterative method. P is the number of iterations. (k = 1, 2, …, P) is the element on the main diagonal of the bidiagonal matrix B, and is the element on the diagonal below the main diagonal.

In both sides of (9), multiply U on the left to get

or can be deduced by (10). On the other hand, apply the Hermit-transform to both sides of (11) to get . Then, multiply V on both sides by the left to get :

Similarly, or (β1 = 0, k = 1, 2, …, P) can be calculated.

The bidiagonalization depending on the Golub–Kahan method transforms the load feature matrix H into sparse bidiagonal matrix B, and the process can be summarized in Table 3.

3.2.3. Solving the Least Squares Problem

Based on the bidiagonalization of P iterations, the least squares problem of load decomposition designated in formula (8) can be transformed into the following equation:where BP is a (P+1)P bidiagonal matrix. denotes the solution of the least squares problem of sparse bidiagonal matrix, which can be attained by QR decomposition of (13). Since V is the unitary matrix, the approximate estimated solution of (8) can be obtained by inwhere is the estimated result, which signifies the component weight of each known load after load decomposition.

3.3. Load Identification for the Transient Process Underlying on the Sequential Leader Clustering Algorithm

The sequential leader clustering algorithm [29, 30] is approved to detect unknown transient events to realize load identification of the transient process. According to the load feature space established in Table 2, the main process of the load identification based on the clustering algorithm is as follows:Step 1: when an unknown transient event occurs, the corresponding load feature indices are extracted and the feature vector of the event is constructed as Fnew = [T, If, Ib, Imax, Pf, Qf, Pb, Qb, Pmax, Qmax]. Each component is a dimensionless value normalized by Z-score standard normalization.Step 2: take the load feature vectors of the transient events corresponding to known load operations as cluster centres. So, the cluster set can be designated as C= [C1,C2,C3,…,CG], where G is the number of known transient events and Ci (i = 1, 2, …, G) refers to the ith cluster centre. The data dimensions and vector elements of each cluster centre in C are the same as Fnew. Then, calculate the Manhattan distances between Fnew and each cluster centre in C, respectively, which can be depicted as D = [d1,d2,d3,…,dG], i.e. di = |Fnew-Ci|.Step 3: find the minimum distance dmin in D and select the appropriate threshold β. If dmin<β, then Fnew belongs to the cluster centre corresponding to dmin. Otherwise, Fnew should be added to set C as a new cluster centre.Step 4: when a new transient event occurs, repeat Step 1 to Step 3.

The setting of distance threshold β depends on the reliability and accuracy of load monitoring equipment. Herein, the DL850 oscillograph recorder is operated for recording the transient waveform and related load characteristics of common appliances. Then, clustering is conducted to find the β parameter values with the best clustering effect.

4. Experimental Analysis and Results

4.1. Evaluation Index

The least unified residue (LUR) is selected to evaluate the accuracy of load decomposition. The LUR is a measure between an unknown composed load compared with the known database items, and it is defined as follows [3]:where refers the composite load feature vector estimated by the introduced algorithm and y is the feature vector of an unknown load-on bus-side as defined in formula (8).

4.2. Simulation Results

In this paper, the NILM method is applied in the industrial application scenario [31]. Four kinds of industrial loads are investigated, including blender, cement screw, sewage pump, and inclined belt conveyor, as shown in Table 4. The industrial load dataset has collected 30 hours of bus-side three-phase current and voltage signals from a concrete plant by DL850 oscillograph recorder. Considering the influence of transformers, on the bus-side, the current transformer (CT) ratio is 100 : 1, the potential transformer (PT) ratio is 1 : 1, and the sampling frequency is 10 kHz. The proposed load decomposition-identification model is performed to this application scenarios.

For three-phase power, the amplitude of voltage or current is the same, but the phase of the waveform differs from each other. The phase of voltage or current is not a determining issue for load decomposition. Hence, data preprocessing is carried out for only one phase.

4.2.1. Data Preprocessing

In this paper, random error data are eliminated by using the Irada criterion to get relatively smooth waveform. Then, according to the Nyquist sampling theorem, the data are sampled at 1 kHz frequency, instead of the original 10 kHz, to balance the efficiency and accuracy of sampling. Figure 4 shows the comparison between the original current waveform and the downsampling of sewage-pump-on event. As shown in Figure 4, the downsampling data do not affect the extraction of load feature, while reducing the data quantity to 1/10 of the original.

4.2.2. Decomposition for the Steady-State Process

In terms of the previous analysis, the RMS value of steady-state current, the active power, and the reactive power are selected as the feature indices. Therefore, the load feature matrix H, which is mentioned in formula (8), is constructed as shown in Table 5. The load feature matrix H here includes four kinds of loads, and each load is described by three feature indices, so matrix H here is a 3 4 matrix.

In order to verify the effectiveness of the proposed algorithm, four groups of unknown power on the bus-side are collected as test data, i.e., y1, y2, y3, and y4. Each test data is the feature vector y in formula (8), which is described with the same features as the load feature matrix H. With reference to the feature indices and extraction methods of the known loads, the corresponding features of each unknown power are extracted, respectively, as shown in Table 6.

Since the units of each feature index are not uniform and the load decomposition results must be nonnegative, the min-max normalization is chosen for dimensionless processing, as formula (6) delineated. Considering the test data y1, the proposed load decomposition algorithm is applied to the normalized feature matrix as shown in Table 7.

Figure 5 shows the influence of iterations of the LSQR algorithm on the accuracy of load decomposition. As the number of iterations increases, the accuracy of load decomposition also improves, while when the number of iterations reaches a certain level, the decomposition accuracy does not improve significantly. It can be seen that the optimal iteration number is 3 for y1 in Figure 5.

Each group of unknown powers are decomposed into a linear combination of four known loads to complete steady-state load decomposition. Based on the optimal iteration of the LSQR algorithm, the decomposition results of each compound power are shown in Table 8. For test data y2, when the number of iterations exceeds one, the decomposition weight appears negative. It does not meet the constraint conditions of (8); hence, the decomposition result of y2 is obtained by one iteration.

The decomposition weight of each known load is calculated by the proposed identification algorithm, and the weight of each load represents the proportion of the load in the composited power on the bus-side. For instance, the decomposition results of test data y1 are 0.249 (blender), 0.340 (cement screw), 5.445(cement screw), and 0.003 (inclined belt conveyor), which indicate the proportion of four loads. It is showed that the sewage pump accounts for the largest proportion. According to the decomposition weight of each load, the LUR of the predicted load combination and unknown power on bus-side are calculated to measure the accuracy of load decomposition.

In order to verify the effectiveness of the proposed algorithm, considering a single sample, the load identification models in [11] (FHMM) and in [12] (ANN) are selected to compare with the proposed model in terms of LUR and calculation time. The identification results of different models are shown in Table 9. It can be seen that although the proposed algorithm does not have advantages in calculation time compared with other novel machine learning models, it has obvious advantages in accuracy.

4.2.3. Identification for the Transient Process

10 feature indices are chosen in terms of the transient feature space established in Table 2, and the load types selected are consistent with steady-state load decomposition. The transient load identification underlying on sequential leader clustering algorithm firstly needs to construct the cluster centres, which signify transient events of typical loads determined by steady-state load decomposition. After a dimensionless processing on the basis of Z-score normalization, as formula (7) defined, the cluster centres can be constructed as Table 10. The eight cluster centres denote eight load on/off events, i.e., blender on/off (class 1/class 5), cement screw on/off (class 2/class 6), sewage pump on/off (class 3/class 7), and inclined belt conveyor on/off (class 4/class 8). According to the experiment, the best clustering effect threshold β is 0.25.

When unknown transient events occur on the bus-side, the related feature indices are extracted and normalized, and then they are classified by the clustering algorithm, so as to identify the specific load switching operation corresponding to the event.

There are four unknown test events (i.e., test 1, test 2, test 3, and test 4), and Figure 6 illustrates the comparison between unknown test events and cluster centre events on the feature of current RMS value. For example, Figure 6(a) shows the comparison of the RMS current value of unknown transient event test1 with the cluster centre event “sewage pump on.” The figure shows that the fitting degree, between the clustered identification results and cluster centre events, is basically consistent. Therefore, the effectiveness of the method presented is further verified.

5. Conclusions

This paper proposes a load decomposition-identification method for industrial users. In order to analyze the transient process caused by load switching operations and the steady-state process after load-on events, the steady-state feature space and the transient feature space of industrial loads are constructed, respectively. LSQR algorithm is proposed to solve load decomposition least square problem for the steady-state process. The sequential leader clustering algorithm is adopted to classify transient events of the transient process to identify load switching operations. Simulation results prove that the disposed method can effectively decompose the specific loads from unknown composed loads and accurately identify the load switching operation with an error rate of about 10−16. Furthermore, a general process of load identification for industrial users is constructed, including feature extraction, data preprocessing, steady-state, and transient load decomposition-identification. On this basis, the detailed operation information of various loads can be acquired from overall load signals. Nevertheless, it can acquire detailed operations of various loads. The algorithm introduced is based on the industrial users’ load signal collected by an oscillograph recorder, and further research on dynamic real-time signal acquisition and online analysis is still needed.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This research was funded by the Science and Technology Project of State Grid (Research and Typical Application of Nonintrusive Load Identification and Perception Technology for Special Transformer Users, 5600-202024168A-0-0-00).