Abstract

In this paper, an innovative approach based on an artificial neural network (ANN) load forecasting model is proposed to improve the accuracy of distribution system state estimation. High-quality pseudomeasurements are produced by a neural model fed with both exogenous and historical load information and applied in a realistic measurement scenario. Aggregated active and reactive powers of small or medium enterprises and residential loads are simultaneously predicted by a one-step-ahead forecast. The correlation between the forecast errors of active and reactive power is duly taken into account in the definition of the estimator, together with the uncertainty of the overall measurement chain. The beneficial effects of the ANN-based pseudomeasurements on the quality of the state estimation are demonstrated by simulations carried out on a small medium-voltage distribution grid.

1. Introduction

Power systems are rapidly evolving and, correspondingly, control and management systems need information on the actual operating conditions of the grids in order to manage them effectively. This information can be obtained in several ways and concerns different types of data but, in any case, the proper operation of the systems is strictly connected to an accurate knowledge of the operating conditions of the grid [1]. Knowledge of possible operating conditions of the network can be obtained by means of load flow and power flow methods (see, for example, [2-4]), which are used, for instance, starting from nominal data in the planning stage or to find the optimal configuration during contingencies. Other methods, suitable for determining the state of the network in terms of node voltages and/or branch currents at run time from the available measurements, are based on state estimation (SE) techniques. SE methods applied to electrical power systems date back to the 1970s, when Fred Schweppe proposed the use of SE for achieving an accurate picture of the operating conditions of transmission networks [5]. Conventional state estimators assume the monitoring system to be overdetermined, with redundant measurements that ensure system observability, which is crucial for the state estimator to work. Unlike in transmission systems, the number of real-time measurements in distribution systems (DSs) is usually limited. Despite the recent vast deployment of smart meters (SMs), the monitoring systems of DSs are still underdetermined; moreover, possible metering or communication problems causing missing or delayed data contribute to the scarcity of real-time measurements. For DSs, SE is not feasible unless so-called pseudomeasurements are introduced to ensure the observability of the system and thus to perform the so-called distribution system state estimation (DSSE). The reviews [6, 7] describe the context of research activity on DSSE techniques.
In the following, the focus will be on the definition of pseudomeasurements for DSSE. Typically, pseudomeasurements are indirect or derived measurements used to describe load/generator power absorption/generation. It is worth recalling that loads in electric DS have highly different location, size, and typology, and distributed generation can be highly variable. Thus, an accurate idea of the behaviour of the overall load configuration is challenging and can be essential for an effective DSSE [8]. In order to allow DSSE to function properly, pseudomeasurements from load estimates and short-term load forecasts can be used [9, 10]. In recent times, research has been carried out on the application of machine learning techniques both for load estimation and load forecasting problems.

Traditional load power estimation can be performed from typical load profiles of each user type. A common choice is to assume that the demand at each hour is equal to the demand at the same hour of the previous equivalent day. The load profiles can be considered as derived estimates of the customer load behaviour with high variance. Consequently, the quality of the state estimates obtained with this approach is generally poor. To improve the accuracy of load power consumption estimation, a neural network approach is proposed in [11] to realign the average load profiles with the real power flow measurements available at the network substation. In particular, for each bus, two feed-forward artificial neural networks (ANNs) are trained: the first one associates real active power flow measurements with active power injections, whereas the second one relates real reactive power flow measurements with reactive power injections.

In the literature, algorithms based on Multiple Linear Regression (MLR) analysis [10] and machine learning techniques [12] have been proposed for load forecasting.

Load forecasting is a difficult task as the time evolution of the loads is complex and exhibits several levels of seasonality; the load at a given hour can be correlated not only with the load at the previous hour but also with the loads at the same hour on the previous day and with the same day in the previous week. Moreover, many important exogenous variables can be considered, especially weather-related variables. In this context, ANNs with their inherent capability to infer a function from data represent an efficient solution for load modelling more than linear models. Generating pseudomeasurements using ANNs for load forecasting has been demonstrated to be a viable solution especially for local-level load modelling [12]. In [13], a closed-loop estimator is proposed for a medium-voltage DS, where a machine learning function provides pseudomeasurements to DSSE. The output of the estimator is then fed back to the machine learning function, which allows the estimation when measurement data are missing. For each medium-voltage (MV) node, two feed-forward ANNs are trained to independently forecast nodal active and reactive power. Load time series and indices categorizing the load time series according to the load patterns are used as ANN inputs. The ANNs are retrained over time whenever a new load time series of an MV node is available. In [14], a SE tool with closed flow between the estimator and the machine learning function is proposed as in [13]. In [14], a feed-forward ANN is trained to forecast active power value one step ahead with three types of inputs: historical load information and weather-related and time-related variables. Once the predictive model has been developed, its performance is monitored continuously in order to detect the deterioration over the medium-to-long term (i.e., weeks to months) and retrain it accordingly.

Among all the proposed machine learning methods, multilayer perceptron (MLP) neural networks have been shown to perform better than others for load forecasting [11, 15]. In particular, in [15], five machine learning methods, i.e., MLP, Support Vector Machines, Radial Basis Functions, Decision Trees, and Gaussian Processes, have been compared in forecasting the active power loads for the next 100 hours. The comparison has been made using three measures of accuracy (MAPE, Mean Absolute Percentage Error; MAE, Mean Absolute Error; and RMSE, Root Mean Squared Error), showing that the MLP is the most robust of the five.

In this context, this paper proposes an MLP load forecasting model for simultaneously generating high-quality active and reactive power pseudomeasurements for an effective branch-current-based DSSE (BC-DSSE). The estimator uses all the available information, suitably considering uncertainty sources and correlations. In particular, appropriate weights are introduced to take into account the SM and forecast uncertainties. A realistic measurement scenario composed of a few real measurements, SMs, load forecasts, and suitable pseudomeasurements is assumed. In particular, a feed-forward ANN is trained to predict the power demands at each MV node one step ahead. Exogenous variables have been used as predictive model inputs together with historical load information. A closed-loop information flow, from the ANN outputs to the inputs, has been created to allow the BC-DSSE to run even when SM measurements at the MV node are not available for the last 24 hours.

The approach proposed in this paper is validated by means of simulation performed on a grid derived from a portion of a distribution network [16], which is a simplified version of an 18-bus UK radial feeder.

2. Distribution System State Estimation

DSSE is the key routine to obtain a picture of the network status at a given time instant. The system is locally considered under steady-state conditions, and the underlying measurement model can be described as follows:

$$\mathbf{z} = \mathbf{h}(\mathbf{x}) + \mathbf{e}, \tag{1}$$

where $\mathbf{z}$ is the vector of available measurements; $\mathbf{x}$ is the vector including all the variables that uniquely define the state of the network; and $\mathbf{h}(\cdot)$ represents all the measurement functions (generally nonlinear) linking the reference measured values to the state of the system. Vector $\mathbf{e}$ includes all the measurement errors and is considered a zero-mean random vector. The state can have different formulations (the so-called voltage state or current state, either in polar or rectangular coordinates), which are equivalent from a theoretical perspective but can lead to advantages in the implementation of DSSE solution routines. Hereafter, the following state is considered (see the BC-DSSE algorithm in [17]):

$$\mathbf{x} = \left[V_s,\; i_1^r, \ldots, i_{N_{br}}^r,\; i_1^x, \ldots, i_{N_{br}}^x\right]^T, \tag{2}$$

where $V_s$ is the voltage magnitude at a node of the network, which is chosen as the reference (e.g., the slack node), while $i_k^r$ and $i_k^x$ are the real and imaginary parts of the $k$-th branch current, with the branch index $k$ ranging from 1 to the number of branches $N_{br}$. This branch-current formulation allows linearizing some of the measurement functions and thus simplifying the estimation process.

From (1), it is clear that, to estimate the network state, it is necessary to manipulate the information coming from the measurements in $\mathbf{z}$, also taking into account the characteristics of the measurement errors. As mentioned above, in DSs, it is hard to have a widespread installation of measurement devices and thus the availability of real-time measurements is typically limited. The measurement vector can thus be divided as follows:

$$\mathbf{z} = \begin{bmatrix} \mathbf{z}_{rt} \\ \mathbf{z}_{ps} \end{bmatrix}, \tag{3}$$

where $\mathbf{z}_{rt}$ includes the real-time measurements, which can be voltage magnitude, current magnitude, and active or reactive power measurements as far as conventional measurements are concerned. Vector $\mathbf{z}_{ps}$ represents information that can be derived from other sources (pseudomeasurements), mainly historical information on active and reactive power consumption or generation. Every load or aggregated load of the network is thus analysed to define its average power absorption or injection. Pseudomeasurements are necessary to allow system observability in this context, but their accuracy is usually very low.

The most widespread technique to perform DSSE is the Weighted Least Squares (WLS) approach, which aims at finding the state that minimizes the following objective function:

$$J(\mathbf{x}) = \left[\mathbf{z} - \mathbf{h}(\mathbf{x})\right]^T \mathbf{W} \left[\mathbf{z} - \mathbf{h}(\mathbf{x})\right], \tag{4}$$

which is the sum of the weighted squared residuals. Matrix $\mathbf{W}$ is the weighting matrix that allows penalizing or favouring each residual in a different way depending on the accuracy of the corresponding measurement or pseudomeasurement. $\mathbf{W}$ is usually chosen as the inverse of the variance-covariance matrix $\boldsymbol{\Sigma}_{\mathbf{z}}$ of the measurements, so that all measurements are weighted differently depending on their uncertainty (corresponding to a maximum likelihood estimation in case of normally distributed measurements). $\boldsymbol{\Sigma}_{\mathbf{z}}$ is typically block-diagonal, since measurements performed by different devices can be considered uncorrelated.

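The minimization in equation (4) is typically obtained by an iterative Newton solution of the following system of normal equations (considering the generic iteration $k$) [9]:

$$\mathbf{G}_k \,\Delta\mathbf{x}_k = \mathbf{H}_k^T \mathbf{W} \left[\mathbf{z} - \mathbf{h}(\mathbf{x}_k)\right], \tag{5}$$

where $\Delta\mathbf{x}_k = \mathbf{x}_{k+1} - \mathbf{x}_k$ is the estimated state variation and $\mathbf{H}_k$ is the Jacobian of the measurement functions in $\mathbf{h}$ with respect to the state variables, computed at the previously estimated state $\mathbf{x}_k$. The so-called gain matrix can be written as

$$\mathbf{G}_k = \mathbf{H}_k^T \mathbf{W} \mathbf{H}_k, \tag{6}$$

and it is constant when measurement functions are linear or can be linearized (see [17] for examples and details).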
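As an illustration only, the iterative WLS solution can be sketched in a few lines of Python with NumPy. The function name `wls_estimate`, the toy linear measurement model, and the numerical values are assumptions made for this sketch; they are not taken from the BC-DSSE implementation in [17].

```python
import numpy as np

def wls_estimate(z, h, jac, W, x0, tol=1e-10, max_iter=20):
    """Gauss-Newton WLS: repeatedly solve (H^T W H) dx = H^T W (z - h(x))."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        H = jac(x)                      # Jacobian at the current estimate
        residual = z - h(x)             # measurement residuals
        G = H.T @ W @ H                 # gain matrix
        dx = np.linalg.solve(G, H.T @ W @ residual)
        x = x + dx
        if np.max(np.abs(dx)) < tol:    # stop when the update is negligible
            break
    return x

# Hypothetical toy case: three linear measurements of a two-variable state.
A = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
x_true = np.array([1.0, 2.0])
z = A @ x_true                           # noise-free measurements
sigma = np.array([0.01, 0.01, 0.5])      # the third one plays the role of a pseudomeasurement
W = np.diag(1.0 / sigma**2)              # weights = inverse variances
x_hat = wls_estimate(z, lambda x: A @ x, lambda x: A, W, x0=np.zeros(2))
```

With linear measurement functions the gain matrix does not change between iterations, so the loop converges after the first update.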

Pseudomeasurements are employed similarly to other measurements but are usually associated with very large uncertainties (e.g., derived from historical variability of loads), which result in small contributions to the objective function in equation (4). Focusing on active power, a typical approach is to consider the following quantities for an unmonitored network node $j$:

$$\tilde{P}_j = \frac{1}{|\mathcal{T}|} \sum_{k \in \mathcal{T}} P_j(k\,\Delta t), \qquad \sigma_{\tilde{P}_j} = \sqrt{\frac{1}{N} \sum_{k \in \mathcal{T}} \left(P_j(k\,\Delta t) - \tilde{P}_j\right)^2}, \tag{7}$$

where $\tilde{P}_j$ and $\sigma_{\tilde{P}_j}$ indicate the active power pseudomeasurement and its standard deviation, respectively, $P_j$ is the active power injected (the injection convention is used for both absorbed and generated power) into node $j$, and $\Delta t$ defines the time resolution of the available information (historical data). The set $\mathcal{T}$ includes the considered time instant indices (with $k$ indicating the generic time instant) for all the available power samples, and $N = |\mathcal{T}| - 1$ is its cardinality decreased by one to obtain the classical unbiased variance estimator. The higher the variability of the load/generator, the larger the corresponding variance, which is inversely proportional to the weight associated with the pseudomeasurement. A typical variation range of the power drawn by a load is over 50% of its nominal value, so it is easy to see that pseudomeasurements, while guaranteeing observability, are often of little help in improving the estimation accuracy.
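A minimal sketch of this conventional pseudomeasurement definition is given below, assuming the historical samples of a node are available as a plain list. The function name and the sample values are hypothetical.

```python
import statistics

def pseudomeasurement(history):
    """Return (mean, sample standard deviation) of historical power samples."""
    mean = statistics.fmean(history)
    std = statistics.stdev(history)   # divides by len(history) - 1 (unbiased variance)
    return mean, std

# Hypothetical historical active power samples of one unmonitored node, in kW.
history_kw = [100.0, 150.0, 50.0, 120.0, 80.0]
p_pseudo, sigma_pseudo = pseudomeasurement(history_kw)
weight = 1.0 / sigma_pseudo**2        # WLS weight: inverse of the variance
```

A load that swings widely yields a large standard deviation and therefore a small weight, which is why such pseudomeasurements contribute little to the estimate.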

In this context, SMs and, in particular, those of the 2nd generation can play a significant role in enhancing the definition of pseudomeasurements. New SMs can provide voltage magnitude and active and reactive power measurements with a much faster reporting rate than before. The Italian authority for electric energy, for instance, issues and continuously updates directives on the functionalities of new-generation SMs, which include, among others, a 2 s measurement interval for “instantaneous” power measurements [18]. These measurements might, in the future, be directly acquired and integrated into estimation algorithms but, due to the huge number of installed SMs (above 30 million in Italy), it would be difficult to manage them directly in real time, and investment costs for communication and computation could easily become overwhelming.

For this reason, it is much more likely that SM measurements are used in an indirect and delayed way. The approach proposed in this paper is to collect active and reactive power measurements from SMs and exploit them to forecast the power consumption at a given time with an anticipation compatible with the timing and data collection requirements. To this purpose, load forecasting Artificial Neural Network (ANN) models have been trained for different types of loads and/or aggregations of loads, as detailed in the following section. In fact, SM measurements from large customers or from a set of users connected to a given node (for instance, in a medium-voltage network) can be gathered from the field one day or a few hours ahead of the time instant of BC-DSSE execution and aggregated, so that they can serve as inputs to the neural load forecasting process.

As previously mentioned, once the forecast power consumption or generation of a given node is obtained, it can be included in the pseudomeasurement vector as an enhanced pseudomeasurement. It is thus important to associate the correct covariance matrix with the new forecast quantities, so that the corresponding residual is weighted correctly in the WLS procedure, according to equation (4).

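Focusing on a generic bus $j$, the forecast procedure gives two predicted quantities $\hat{P}_j$ and $\hat{Q}_j$, which are the active and reactive power injections, respectively. The following covariance matrix is thus needed:

$$\boldsymbol{\Sigma}_{\hat{P}_j \hat{Q}_j} = \begin{bmatrix} \sigma^2_{\hat{P}_j} & \sigma_{\hat{P}_j \hat{Q}_j} \\ \sigma_{\hat{P}_j \hat{Q}_j} & \sigma^2_{\hat{Q}_j} \end{bmatrix} = \begin{bmatrix} \sigma^2_{\hat{P}_j} & \rho_{\hat{P}_j \hat{Q}_j}\,\sigma_{\hat{P}_j}\sigma_{\hat{Q}_j} \\ \rho_{\hat{P}_j \hat{Q}_j}\,\sigma_{\hat{P}_j}\sigma_{\hat{Q}_j} & \sigma^2_{\hat{Q}_j} \end{bmatrix}, \tag{8}$$

where $\sigma^2_{\hat{P}_j}$ and $\sigma^2_{\hat{Q}_j}$ are the variances of the new pseudomeasurements $\hat{P}_j$ and $\hat{Q}_j$, while $\sigma_{\hat{P}_j \hat{Q}_j}$ is the covariance between the pseudomeasurement errors of the two forecast powers and $\rho_{\hat{P}_j \hat{Q}_j}$ is the corresponding correlation coefficient. It is interesting to notice that, while in conventional DSSE approaches the active and reactive power pseudomeasurements are usually considered as uncorrelated, in the proposed approach further information is taken into account, and the correlation arising in the simultaneous estimation of $\hat{P}_j$ and $\hat{Q}_j$ can be easily included in the estimator by using the submatrix $\boldsymbol{\Sigma}_{\hat{P}_j \hat{Q}_j}^{-1}$ in the overall weighting matrix.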
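The construction of the 2x2 covariance block and of the corresponding weight submatrix can be sketched as follows. The function name and the numerical values are illustrative assumptions, not values from the paper.

```python
import numpy as np

def pq_weight_block(sigma_p, sigma_q, rho):
    """Covariance of the (P, Q) pseudomeasurement pair and its inverse (weight block)."""
    cov = rho * sigma_p * sigma_q
    Sigma = np.array([[sigma_p**2, cov],
                      [cov, sigma_q**2]])
    return Sigma, np.linalg.inv(Sigma)

# Hypothetical standard deviations (kW, kvar) and correlation coefficient.
Sigma, W_block = pq_weight_block(sigma_p=10.0, sigma_q=4.0, rho=0.8)
```

With a nonzero correlation the weight block has nonzero off-diagonal terms, so the P and Q residuals of the same node are no longer penalized independently in the objective function.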

Another important aspect, which is usually overlooked in the literature, is the modelling of the SM uncertainty. The proposed load forecasting is designed to obtain an estimation of the measured $P_j$ and $Q_j$ at a given time instant from previously available measurements, but this means that the computed power values can be considered only as approximations of the measured values (reference values are obviously unknown in practical conditions and real-time operation). The SM measurement chain is an additional source of uncertainty that affects the values considered in equation (8). As an example, the calibration process of SM devices cannot be perfect and cannot compensate for all the systematic errors in all the operating conditions.

For this reason, in the following, the definition of the weights is discussed in detail in the presence of both forecast and SM errors. Focusing on $\hat{P}_j$ and $\hat{Q}_j$, it is possible to distinguish the two zero-mean error contributions as follows:

$$\hat{P}_j = P_j + e^{f}_{P_j} + e^{SM}_{P_j}, \qquad \hat{Q}_j = Q_j + e^{f}_{Q_j} + e^{SM}_{Q_j},$$

where $P_j$ and $Q_j$ are the ideal reference values of active and reactive power, $e^{f}_{P_j}$ and $e^{f}_{Q_j}$ are the corresponding forecast errors, and $e^{SM}_{P_j}$ and $e^{SM}_{Q_j}$ are the errors associated with the aggregated SM outputs. In the following, in the absence of further information, all the errors of the SMs associated with loads or generators grouped under node $j$ are considered independent. Similarly, the active and reactive power measurement errors are assumed uncorrelated.

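Given the relative standard uncertainty $u_{SM}$ of the generic SM (as derived from the SM datasheets and assumed as common to all SMs, without loss of generality), the relative standard uncertainty associated with the aggregated SM measurement of the active power at node $j$ becomes

$$u_{P_j} = u_{SM}\,\frac{\sqrt{\sum_{l \in \mathcal{L}_j} P_l^2}}{\left|\sum_{l \in \mathcal{L}_j} P_l\right|},$$

where $P_l$ is the measured active power of load $l$, which belongs to the set $\mathcal{L}_j$ of the loads downstream of the MV node $j$. Since the lack of knowledge in SM behaviour can be considered as independent from the prediction errors, the overall variance of the measurement can be expressed as follows:

$$\sigma^2_{\hat{P}_j} = \sigma^2_{f,P_j} + \left(u_{P_j} P_j\right)^2,$$

where $\sigma^2_{f,P_j}$ is the variance of the active power forecast error.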
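The effect of aggregation on the SM uncertainty can be checked numerically. The sketch below assumes independent SM errors with a common relative standard uncertainty, as stated above; the function name and the values are hypothetical.

```python
import math

def aggregated_relative_uncertainty(powers, u_sm):
    """Relative standard uncertainty of a sum of independently measured powers."""
    total = sum(powers)
    # Independent errors: the variances of the individual readings add up.
    std_total = u_sm * math.sqrt(sum(p * p for p in powers))
    return std_total / abs(total)

# Hypothetical node: 100 identical loads of 10 kW, each metered with 1% uncertainty.
u_node = aggregated_relative_uncertainty([10.0] * 100, u_sm=0.01)
```

For equal loads the relative uncertainty of the aggregate shrinks with the square root of the number of meters, here from 1% to 0.1%.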

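Similar expressions are valid for the reactive power, while the correlation coefficient $\rho_{\hat{P}_j \hat{Q}_j}$, under the above assumptions, becomes as follows:

$$\rho_{\hat{P}_j \hat{Q}_j} = \rho^{f}_{P_j Q_j}\,\frac{\sigma_{f,P_j}\,\sigma_{f,Q_j}}{\sigma_{\hat{P}_j}\,\sigma_{\hat{Q}_j}},$$

where $\rho^{f}_{P_j Q_j}$ is the correlation coefficient of the active and reactive power forecast errors, $\sigma_{f,P_j}$ and $\sigma_{f,Q_j}$ are the standard deviations of the forecast errors, and $\sigma_{\hat{P}_j}$ and $\sigma_{\hat{Q}_j}$ are the overall standard deviations including the SM contribution.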
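A short numerical sketch shows how the SM contribution inflates the variances and attenuates the correlation. It assumes, as above, that SM errors are independent of forecast errors and uncorrelated between P and Q; all names and values are illustrative.

```python
import math

def combined_uncertainty(sigma_f_p, sigma_f_q, rho_f, sigma_sm_p, sigma_sm_q):
    """Overall standard deviations and correlation of the P and Q pseudomeasurements."""
    sigma_p = math.hypot(sigma_f_p, sigma_sm_p)   # sqrt(forecast var + SM var)
    sigma_q = math.hypot(sigma_f_q, sigma_sm_q)
    # Only the forecast errors are correlated, so the covariance is unchanged
    # while both standard deviations grow.
    rho = rho_f * sigma_f_p * sigma_f_q / (sigma_p * sigma_q)
    return sigma_p, sigma_q, rho

# Hypothetical values: forecast std 3, SM std 4, forecast error correlation 0.9.
sigma_p, sigma_q, rho = combined_uncertainty(3.0, 3.0, 0.9, 4.0, 4.0)
```

The resulting correlation is always smaller in magnitude than the forecast error correlation, and equal to it only when the SM uncertainty is negligible.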

The way these values are computed in practice is explained in the following section, where the information available at each step is discussed. The above equations allow the computation of the elements of the covariance matrix of the forecast powers (see equation (8)) and thus the definition of weights for BC-DSSE that appropriately reflect the actual uncertainty in the proposed pseudomeasurement model.

3. Load Forecasting Neural Network Model

A Multilayer Perceptron (MLP) ANN with one hidden layer is used to forecast the load demand one step ahead (roughly speaking, “t + 1 prediction,” where the actual time step depends on the chosen ANN model in terms of input and output variables), where the time interval for prediction update is assumed equal to half an hour. In particular, residential and Small and Medium Enterprise (SME) loads have been considered in this paper. An MLP has been trained for each single or aggregated load. It has to be noted that, while only the active power has been forecasted for the residential loads, as in the majority of the literature, the corresponding reactive power is also considered in the proposed model for the SME loads. Figure 1 shows the structure of the MLP neural network for an SME.

The relationship between input and output patterns is described by the following algebraic equations system:

$$\begin{cases} \mathbf{W}_1\,\mathbf{x}(t) + \mathbf{b}_1 = \mathbf{y}(t) \\ \mathbf{h}(t) = \sigma\left(\mathbf{y}(t)\right) \\ \mathbf{W}_2\,\mathbf{h}(t) + \mathbf{b}_2 = \mathbf{o}(t+1) \end{cases} \tag{14}$$

where $\mathbf{x}(t)$ is the input vector, which contains variables related to the time instant $t$:

(i) Two weather variables at the time instant $t$: temperature, measured in °C, and humidity, in percentage;

(ii) Three time-related variables: a label corresponding to the hour of the day, a label for the day of the week, and a binary label distinguishing working days from nonworking days;

(iii) Four historical demand variables, which have a strong correlation with the recorded demand profile (active power for residential loads, both active and reactive powers for SMEs): the demand at the previous hour, the demand at the same hour of the previous day, the demand at the same hour of the previous week, and the 24-hour average power evaluated considering all the recorded values in the previous day. As the considered time step is half an hour, the variables at the previous hour, day, and week correspond to 2 steps, 48 steps, and 336 steps back.
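The assembly of the nine residential inputs listed above can be sketched as follows. This is a sketch under stated assumptions: the lags are counted back from the target step `n`, the 24-hour average is taken over the 48 samples preceding `n`, and the function names, label encodings, and data are hypothetical. For an SME, the four historical variables would be repeated for the reactive power, giving 13 inputs.

```python
from statistics import fmean

LAG_HOUR, LAG_DAY, LAG_WEEK = 2, 48, 336   # lags in half-hour steps

def historical_features(load, n):
    """Four historical demand inputs for target step n of a half-hourly series."""
    if n < LAG_WEEK:
        raise ValueError("at least one week of history is required")
    return [
        load[n - LAG_HOUR],            # demand one hour before
        load[n - LAG_DAY],             # same hour, previous day
        load[n - LAG_WEEK],            # same hour, previous week
        fmean(load[n - LAG_DAY:n]),    # assumed 24-hour average window
    ]

def input_vector(load, n, temperature, humidity, hour, weekday, working_day):
    """Weather, time-related, and historical inputs, in that order."""
    return [temperature, humidity, hour, weekday, working_day] + historical_features(load, n)

# Hypothetical data: a ramp makes the selected lags easy to verify by eye.
load = [float(i) for i in range(400)]
x = input_vector(load, 350, temperature=12.5, humidity=80.0,
                 hour=7, weekday=2, working_day=1)
```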

The output of the network consists of the one-step-ahead forecast of the load demand.

At the time instant that corresponds to the current time tag of the DSSE computation update, information on the active and/or reactive powers of node $j$ is needed. In the proposed solution, if the ANN forecast model is available for node $j$, powers $\hat{P}_j$ and/or $\hat{Q}_j$ are obtained from the performed prediction. In this case, the instant indicated as “t + 1” in the forecast model description (see Figure 1) corresponds to the current time instant. For ease of presentation, a perfect match has been adopted here between the DSSE update interval and the time interval of prediction update (half-hour step for both prediction and estimation updates), but other solutions at different rates are also possible following a similar scheme. As mentioned above, an ANN model is built for all the loads of interest and its outputs (predicted load powers) are fed into the DSSE algorithm at the following time step, with the procedure described in the previous section.

In the MLP equations, $\mathbf{W}_1$ is the weight matrix of the input layer, $\mathbf{b}_1$ is the bias vector of the input layer, $\mathbf{y}$ is the input of the hidden layer, $\mathbf{h}$ is the output of the hidden layer, $\sigma(\cdot)$ is the hidden neuron (logistic) activation function, $\mathbf{W}_2$ is the weight matrix of the output layer, and $\mathbf{b}_2$ is the bias vector of the output layer.

The Levenberg–Marquardt algorithm [19], which combines the gradient descent method and the Gauss–Newton method, has been used for the MLP training. The hyperbolic tangent sigmoid transfer function is used in the hidden layer. The inputs and outputs are normalized to a common range before being used to train the ANN, in order to balance the importance of the input variables.
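For illustration, the forward pass of a one-hidden-layer MLP with a hyperbolic tangent hidden layer and a linear output layer can be written as below. The weights are random placeholders rather than trained values, the function name is hypothetical, and the training step (Levenberg–Marquardt in the paper) is not shown.

```python
import numpy as np

def mlp_forward(x, W1, b1, W2, b2):
    """One-hidden-layer MLP: tanh hidden units followed by a linear output layer."""
    y = W1 @ x + b1        # input of the hidden layer
    h = np.tanh(y)         # output of the hidden layer
    return W2 @ h + b2     # network output (one-step-ahead forecast)

# Hypothetical SME-sized network: 13 inputs, 20 hidden neurons, 2 outputs (P and Q).
rng = np.random.default_rng(0)
n_in, n_hidden, n_out = 13, 20, 2
W1 = rng.normal(scale=0.1, size=(n_hidden, n_in))
b1 = np.zeros(n_hidden)
W2 = rng.normal(scale=0.1, size=(n_out, n_hidden))
b2 = np.zeros(n_out)
x = rng.uniform(-1.0, 1.0, size=n_in)   # placeholder normalized inputs
out = mlp_forward(x, W1, b1, W2, b2)
```

Producing P and Q from the same hidden layer is what allows the two forecast errors to be treated jointly, including their correlation.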

In the following, the case study and the database used to train and test the forecasting models are described.

4. Case Study

To evaluate the performance of the proposed approach, several tests have been carried out on a single-phase 18-bus network derived from a UK network (Figure 2) [20]. Connected to the 33 kV at bus 1, the network has a common rated bus voltage level at 11 kV. This network is used since it is adopted for other studies in the literature and gives a realistic load scenario for a MV network. On this topology, both industrial and residential loads have been considered as explained in detail in the next section. There is no loss of generality in considering loads information (SM measurements) coming from the database described in the next section, since individual loads are aggregated to replicate a load scenario that is compatible with nominal data of the considered network in [20]. Main assumptions and test results are also reported and discussed in the following.

4.1. Database for the Load Forecasting

To train and test the neural networks used to perform the load forecasting, data related to the active power consumption, available from the Commission for Energy Regulation (CER) [21], have been used. This database is anonymized and consists of half-hourly SM energy consumption recorded from 6445 customers who participated in the “Electricity Smart Metering Customer Behavior Trials” [22]. The data were collected over a period of 18 months (from July 14, 2009, to December 31, 2010), at various distribution network locations in Ireland. The customer types are classified as residential (4225), SME (485), and others (1735). The present paper focuses on residential (the largest group in the available database) and SME customers who completed the trial. The SME customers are grouped into four subsectors: entertainment (including hotels, restaurants, sporting facilities, and public houses), industrial manufacturing, offices, and retail premises.

After removing the consumers having missing data, a database of 3423 residential and 287 SME consumers has been obtained. For SME loads, the reactive power profiles have been obtained starting from a real power factor (PF) profile recorded on a typical industrial site. In particular, the recorded PF profile has been propagated on the whole observation period. No reactive power was taken into account for residential loads.

Analyzing residential and SME data, a significant difference between the two consumer types can be observed. Figure 3 shows the power consumption ($P$, where the node index is dropped in the following when unnecessary, for the sake of simplicity) of four residential loads, randomly selected from the database. As can be noted, the energy consumption of each household is low, irregular, and very different in two consecutive working days, as it depends on the lifestyle of its residents. On the contrary, the SME loads (Figure 4) appear typically high and mostly regular due to the regular activity during the working hours and working days. Therefore, a different load forecasting performance for these two subsets is expected.

Since the objective of this paper is an effective state estimation of a DS, different equivalent MV loads have been created by aggregating loads from the database, with the aim of obtaining power levels compatible with those of the considered grid (Figure 2). The load aggregation has been performed by simply summing together the readings of several SMs. In particular, three different SME loads are obtained by aggregating the data of 21, 26, and 46 individual SMEs, respectively. Moreover, two different residential loads are obtained by aggregating the energy consumption of 523 and 537 randomly chosen residential loads, respectively. In Table 1, the ranges of the active power of the aggregate loads are reported.
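The aggregation step is a plain element-wise sum over time-aligned SM readings, as in the sketch below. The function name and the three short profiles are hypothetical.

```python
def aggregate(profiles):
    """Sum time-aligned meter profiles sample by sample into one node-level profile."""
    if len({len(p) for p in profiles}) != 1:
        raise ValueError("all profiles must cover the same time steps")
    return [sum(samples) for samples in zip(*profiles)]

# Hypothetical readings (kW) of three meters over three half-hour steps.
meters = [
    [0.2, 0.5, 0.1],
    [0.3, 0.1, 0.4],
    [0.0, 0.2, 0.2],
]
node_load = aggregate(meters)
```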

Figure 5 shows the power consumption of one aggregated residential load and one aggregated SME load in the same two consecutive days considered in Figures 3 and 4. As expected, the aggregation makes the load profile more predictable. In fact, especially in the case of residential loads (Figure 5(a)), the aggregate load is more periodic and smoother than the individual ones. This is because the aggregation removes the high-frequency impulses corresponding to random events in the individual curves, smoothing out the randomness. Regarding the aggregated SME load (Figure 5(b)), its periodicity is more evident because the individual SME loads are already more periodic than individual residential loads.

The demand profiles depend not only on historical load evolution but also on exogenous variables, such as season and weather-related variables. Therefore, weather data, collected from the Irish Meteorological Service [23], have been added to the database. Among the weather variables, in this paper, the temperature (in °C) and the humidity (in percentage) have been chosen as neural model inputs. This is because when the value of the temperature varies, the power system demand also varies. Furthermore, the humidity plays a relevant role in driving electricity demand during the warm months. In fact, the temperature above certain values is intensified by high humidity [24]. As the aggregation has been performed by selecting individual loads located in different Irish areas, the simple averages of the temperature and the humidity percentage measured by several weather stations located in the central area of Ireland are used to represent the corresponding weather variables.

To highlight the dependence of the load energy consumption on the weather data, Figure 6 reports the daily energy consumption of one aggregated residential load and one aggregated SME load, together with the average daily temperature trend over the same period. As can be noted, both residential and SME aggregated loads show a time pattern dependent on the temperature, with a stronger dependence for the residential load. The temperature, even if prevailing, is not the only factor affecting the load energy consumption. In fact, the drastic reduction in the consumption of SME loads and the increase in the residential one at the end of December 2009 are mainly related to the Christmas-New Year period rather than to the temperature. In this paper, the dependence of the load energy consumption on the weather data is demonstrated by evaluating how the performance of the forecasting model is affected when the weather variables are excluded from the inputs of the model. The results are shown in the following section. Other weather variables, such as precipitation and wind speed, were analyzed, but it was found that, in this case, they do not have a significant impact on the energy demand.

As Figure 6 highlights, the considered energy consumption time series shows strong regularity, and a spectrum analysis revealed a prevalent daily periodicity; at the same time, the series is decidedly nonlinear. This complex data behavior can be captured by an MLP, which is able to construct a large set of nonlinear input/output mappings by combining an appropriate number of nonlinear activation functions. Therefore, in this case study, a simple MLP with a suitable structure can be enough to build a well-performing forecasting model while avoiding overfitting of the training data.

4.2. Performance of the Load Forecasting Models

A neural load forecasting model has been designed for each aggregated load characterized by the range reported in Table 1. In order to optimize the load forecasting network architecture, a trial-and-error approach has been used to choose the appropriate number of hidden layer nodes, which consists of progressively growing the number of nodes and selecting the network that minimizes the prediction error on the validation set. This optimization procedure resulted in 20 neurons for all the five networks (associated with the loads in Table 1). Therefore, the best MLP architecture consists of an input layer with one neuron for each input variable (thus 9 or 13, for residential or SME, respectively), one hidden layer with 20 neurons, and an output layer with one neuron for each output variable (1 or 2 for residential or SME, respectively). Thus, the dimensions of the weight matrices and bias vectors in equation (14) result in $20 \times 9$ for $\mathbf{W}_1$, $20 \times 1$ for $\mathbf{b}_1$, $1 \times 20$ for $\mathbf{W}_2$, and $1 \times 1$ for $\mathbf{b}_2$ in case of residential loads. In case of SME loads, $\mathbf{W}_1$ and $\mathbf{b}_1$ are $20 \times 13$ and $20 \times 1$, respectively, whereas $\mathbf{W}_2$ and $\mathbf{b}_2$ are $2 \times 20$ and $2 \times 1$, respectively.
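The parameter shapes implied by these layer sizes can be derived mechanically, as in this small sketch. The helper names are hypothetical, and the parameter totals are simple arithmetic on the stated layer sizes rather than figures reported in the paper.

```python
def mlp_shapes(n_in, n_hidden, n_out):
    """Shapes of the weight matrices and bias vectors of a one-hidden-layer MLP."""
    return {
        "W1": (n_hidden, n_in),
        "b1": (n_hidden, 1),
        "W2": (n_out, n_hidden),
        "b2": (n_out, 1),
    }

def n_parameters(shapes):
    """Total number of trainable parameters."""
    return sum(rows * cols for rows, cols in shapes.values())

residential = mlp_shapes(n_in=9, n_hidden=20, n_out=1)
sme = mlp_shapes(n_in=13, n_hidden=20, n_out=2)
```

Both networks have only a few hundred parameters, which is small compared with the number of half-hourly training samples available in one year.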

The time series of each load profile is composed of 25728 half-hourly active and reactive (when considered) power values, from August 1, 2009, to December 31, 2010, while the July 2009 data were not used as only the recordings of fifteen days were available. The MLP training has been performed using the data of the first 12 months. The validation has been performed using the following 2 months (from August 1, 2010, to September 30, 2010). The last 3 months (from October 1, 2010, to December 31, 2010) have been used to test the trained neural model.

Since the forecasting accuracy depends on both the quality and the quantity of the historical data used to train the predictor, a greater amount of data (for example, an extra year) would be expected to improve the prediction performance.

Note that a realistic assumption about the monitoring architecture could be that data from SMs are actually available after 24 hours. However, the proposed solution can also work with data collected one hour before (that could represent a future-proof scenario). Thus, during the training of the neural model, this information, both for active and reactive powers, has been included among inputs, because it is always available in the offline phase. In the online test phase, the corresponding inputs have been replaced by the values forecasted by the neural predictor at the previous steps. The outputs of the predictor are then fed back to the input layer creating a closed-loop information flow. This allows the state estimation even when measurement data are missing.
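The closed-loop mechanism can be sketched independently of the neural model: each forecast is appended to the history buffer and stands in for the measurement that has not arrived yet. The function names are hypothetical, and the stand-in predictor below is a trivial persistence rule, not the trained MLP.

```python
def closed_loop_forecast(predict_next, measured, n_steps):
    """Run n_steps one-step-ahead forecasts, feeding each forecast back as history."""
    buffer = list(measured)              # last measurements actually received
    forecasts = []
    for _ in range(n_steps):
        y_next = predict_next(buffer)    # uses only values already in the buffer
        forecasts.append(y_next)
        buffer.append(y_next)            # forecast replaces the missing measurement
    return forecasts

def two_step_persistence(buffer):
    """Stand-in predictor: repeat the value observed two steps (one hour) earlier."""
    return buffer[-2]

out = closed_loop_forecast(two_step_persistence, [1.0, 2.0], n_steps=4)
```

Because forecast values re-enter the inputs, prediction errors can accumulate over long gaps, which is one reason to monitor the forecasting performance on the test set.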

To evaluate the performance of the predictive models, the MAE, the MAPE, and the Root Mean Square Percentage Error (RMSPE) have been used, defined as follows:

$$\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N}\left|y_i - \hat{y}_i\right|,$$

$$\mathrm{MAPE} = \frac{100}{N}\sum_{i=1}^{N}\left|\frac{y_i - \hat{y}_i}{y_i}\right|,$$

$$\mathrm{RMSPE} = 100\sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(\frac{y_i - \hat{y}_i}{y_i}\right)^{2}},$$

where $y_i$ is the actual load value, which can represent either the active power or the reactive power of the considered load (that is, the nodal active and reactive powers of the DSSE section above); $\hat{y}_i$ is the corresponding predicted load value; and $N$ is the number of training or testing samples. The smaller the values of MAE, MAPE, and RMSPE, the better the forecasting performance.
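As an illustration, the three error indices can be computed as in the following sketch. It assumes the standard definitions of MAE, MAPE, and RMSPE (with the percentage errors normalized by the actual values, which must be nonzero); the function names and the sample values are hypothetical and are not taken from the data set of this paper.

```python
import math
from typing import Sequence


def mae(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute error, in the same unit as the load."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error (actual values must be nonzero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def rmspe(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean square percentage error (actual values must be nonzero)."""
    mean_sq = sum(((a - p) / a) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    return 100.0 * math.sqrt(mean_sq)


if __name__ == "__main__":
    actual = [100.0, 200.0]      # hypothetical load values
    predicted = [110.0, 180.0]   # hypothetical forecasts
    print(mae(actual, predicted), mape(actual, predicted), rmspe(actual, predicted))
```

Since RMSPE squares the relative errors before averaging, it penalizes occasional large deviations more heavily than MAPE does.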

Figure 7 shows the actual (black line) and predicted (red line) active power time series (top) and the corresponding differences between predicted and actual powers (bottom) for one month (October) of the test set, for one of the aggregated loads. Figure 8 reports the same quantities for a second load. As can be noted, the trends of the two loads are well reproduced by the neural predictors.

Figures 9 and 10 report the actual and predicted active power time series (top) and the corresponding prediction error (bottom) for two of the loads during the first test week. The MAPE of the forecasted active power in this time window is 4.7% for the load in Figure 9 and 4.8% for the load in Figure 10. Moreover, the validity of the zero-mean hypothesis has been verified for the errors of both active and reactive power prediction.

Tables 2 and 3 report the training and test performance obtained for the SME and residential aggregated loads, respectively.

As expected, the performance deteriorates in the test phase. For the SME loads, the percentage errors of the active and reactive powers are very similar. The correlation coefficients between the real and reactive power errors have then been evaluated, to be used in the state estimation described in the following.

The results on the test set show that the proposed predictive model is able to forecast both active and reactive powers simultaneously (when required) with limited errors, starting from both exogenous and historical measurements. Moreover, it overcomes the problem of limited or time-delayed historical measurement availability through a closed-loop information flow, which replaces the missing data with the values forecasted by the predictor itself at the previous step.

The influence of the input variables on the load energy consumption can be evaluated through the performance of the forecasting model. In fact, the performance of the neural network model is expected to deteriorate when an influential variable is excluded from the inputs. In this paper, since aggregated loads show a time pattern that depends on the temperature, as highlighted in Figure 6, the dependence of the load energy consumption on the weather data has been assessed. When the weather variables are removed, the RMSPE on the test set increases by about 2% for both active and reactive powers for SME customers, and by about 1% for the residential ones (which are significant variations with respect to the results reported in Tables 2 and 3).

5. Performance of the Distribution Systems State Estimation

To assess the estimator performance, several simulations have been carried out starting from a measurement scenario that is realistic for a distribution grid. Two measurement points have been assumed on the network: bus 1, with a voltage magnitude measurement and an active and reactive power flow measurement, and bus 4, with a voltage magnitude measurement. SMs have been assumed to provide data that are fully available the day after the measurements. An accuracy of 1% for the voltage magnitude and of 3% for the power flows and SM measurements has been assumed.

A Monte Carlo approach has been applied in order to obtain statistically sound results, and the following assumptions are made:
(i) Number of Monte Carlo trials, $N_{\mathrm{MC}} = 1000$
(ii) A maximum deviation of 50% with respect to the nominal values for the active and reactive powers drawn by the loads (uniform distribution)
(iii) Measurement errors uniformly distributed
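The sampling implied by these assumptions can be sketched as follows. This is only an illustration: the function names are hypothetical, the nominal and reference values are placeholders, and the accuracy is assumed here to be the relative half-width of the uniform error distribution, which the text does not state explicitly.

```python
import random

N_MC = 1000  # number of Monte Carlo trials


def sample_load(nominal: float, rng: random.Random, max_dev: float = 0.5) -> float:
    """Draw a load power uniformly within +/- max_dev of its nominal value."""
    return nominal * (1.0 + rng.uniform(-max_dev, max_dev))


def sample_measurement(reference: float, accuracy: float, rng: random.Random) -> float:
    """Draw a measurement with a uniform relative error.

    Assumption: `accuracy` is the relative half-width of the distribution.
    """
    return reference * (1.0 + rng.uniform(-accuracy, accuracy))


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed for reproducibility
    # Placeholder nominal active power (arbitrary unit).
    loads = [sample_load(100.0, rng) for _ in range(N_MC)]
    # Placeholder voltage magnitude (per unit) with 1% accuracy.
    voltages = [sample_measurement(1.0, 0.01, rng) for _ in range(N_MC)]
    print(min(loads), max(loads), min(voltages), max(voltages))
```

In a full simulation, each sampled load condition would be passed to a load flow calculation to obtain the reference values before the measurements are extracted.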

The three SME loads have been connected to buses 17, 14, and 7, respectively (and are thus described by the corresponding pairs of nodal active and reactive powers), while the two residential loads have been associated with buses 3 and 12 (and thus with the corresponding nodal active powers). For these loads, the last 1000 values of the test set have been considered, which correspond to a time interval of about 21 days. For each instant, a different operating condition of the network is thus considered by using these values for the SME and residential loads and extracting the reference values by applying the SM uncertainty. For all the other loads, active and reactive powers are extracted from the nominal values according to the above assumptions. Then, all reference values are computed from these load conditions by means of a load flow calculation. Finally, measurements are extracted from their random distributions and used as inputs to the BC-DSSE.

To assess the performance of the estimator, two different formulations and configurations of the BC-DSSE, which correspond to different computation and management of the pseudomeasurements, have been adopted. The first one, which uses the proposed estimator, exploits the predictions of the loads coming from the corresponding neural load forecasting models. This case is indicated as “Prediction” in Figures 11–13, which report the DSSE results in the following.

As mentioned in the previous section, the forecast active and reactive powers of the SME loads (connected to buses 17, 14, and 7) have been used for each instant, together with the forecast active power of the residential loads (connected to buses 3 and 12).

To build the weighting matrix, submatrices concerning the forecast loads must be included in the BC-DSSE besides the real-time measurement weights (see equations (9) and (10)). According to equation (12) (and its counterpart for reactive power), the variance of each pseudomeasurement is computed from the forecast error statistics of the training set and from the datasheet information for the SMs. Since the SM measurements are not available in real time, the relative uncertainty of the aggregated power at each time instant is taken as the relative uncertainty of the day before at the same hour, and it is associated with the aggregated forecast powers at the current instant. This procedure has been applied for all the estimations. It is worth noting that, with this model, the variance of the pseudomeasurements can also be updated at fixed intervals by considering the measurements and forecast data obtained in the meantime.
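The role of the correlation between active and reactive power errors in the weighting matrix can be illustrated with a generic sketch. It does not reproduce equations (9), (10), or (12): it only shows, under the assumption of a standard weighted least squares formulation, how a $2 \times 2$ covariance block for a correlated pair of pseudomeasurements and its inverse (the weight block) could be built. The function name and the numerical values are hypothetical.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def pq_weight_block(sigma_p: float, sigma_q: float, rho: float) -> Tuple[Matrix, Matrix]:
    """Return the 2x2 covariance block of a (P, Q) pair and its inverse.

    sigma_p, sigma_q: standard deviations of the two pseudomeasurements.
    rho: correlation coefficient between their errors (|rho| < 1).
    """
    cov_pq = rho * sigma_p * sigma_q
    det = (sigma_p ** 2) * (sigma_q ** 2) - cov_pq ** 2
    if det <= 0.0:
        raise ValueError("covariance block is singular; check sigma and rho")
    cov = [[sigma_p ** 2, cov_pq], [cov_pq, sigma_q ** 2]]
    # Closed-form inverse of a symmetric 2x2 matrix.
    weight = [[sigma_q ** 2 / det, -cov_pq / det],
              [-cov_pq / det, sigma_p ** 2 / det]]
    return cov, weight


if __name__ == "__main__":
    cov, weight = pq_weight_block(sigma_p=2.0, sigma_q=1.0, rho=0.5)
    print(cov)
    print(weight)
```

With zero correlation the block reduces to the usual diagonal weights, that is, the reciprocals of the two variances.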

The second formulation corresponds to the classical BC-DSSE, where no forecasting is considered and the pseudomeasurements of the nodes are computed directly from the available measurement data. In particular, the measured power values collected the day before at the same hour are used as pseudomeasurements. This estimator thus does not apply predictions (it is referred to as the “No prediction” algorithm) and is considered as a benchmark for the proposed method in the same network scenarios.
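For half-hourly data, the benchmark pseudomeasurement at a given instant is simply the value recorded 48 samples earlier. A minimal sketch is given below; the function name is hypothetical, and `None` marks the first day, for which no day-before value exists.

```python
from typing import List, Optional

SAMPLES_PER_DAY = 48  # half-hourly recordings


def day_before_pseudomeasurements(series: List[float]) -> List[Optional[float]]:
    """Pseudomeasurement at instant t = measurement at the same hour of the day before."""
    return [
        series[t - SAMPLES_PER_DAY] if t >= SAMPLES_PER_DAY else None
        for t in range(len(series))
    ]


if __name__ == "__main__":
    measured = [float(t) for t in range(100)]  # placeholder measurements
    pseudo = day_before_pseudomeasurements(measured)
    print(pseudo[47], pseudo[48], pseudo[99])
```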

Once the state variables (branch currents) are estimated along with the derived quantities (e.g., voltages and power flows), the results obtained with the two methods are compared in terms of percent root mean square errors (RMSEs) of the estimations (i.e., the square root of the mean of the squared differences between the estimated quantities and the corresponding reference values). The RMSE results of the branch-current magnitude estimations are presented in Figure 11. The bar plot in red (dashed line) shows the results obtained considering the prediction of the loads, while the bar plot in grey (the same holds for Figures 12 and 13) presents the results obtained with the day-before pseudomeasurements described above. A reduction of the percent RMSE close to 12% (meaning that the error is about halved) has been obtained as the best case on branch 6, where the largest forecast load is connected, and an average reduction of more than 3% is also obtained. The reductions in the estimation errors are more evident close to the position of the forecast loads. Branches 12 and 13 show the same accuracy results, since node 13 is a zero-injection node. The same holds for the pairs 7-8 and 14-15.
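The comparison metric can be sketched as follows. The sketch assumes that the percent RMSE is obtained by normalizing each difference by the corresponding reference value, which is one possible reading of the definition above, and it distinguishes the absolute reduction of the percent RMSE from the relative error reduction. All names and numbers are hypothetical and are not results of this paper.

```python
import math
from typing import Sequence


def percent_rmse(estimates: Sequence[float], references: Sequence[float]) -> float:
    """Percent RMSE over the trials (assumption: errors relative to the references)."""
    mean_sq = sum(((e - r) / r) ** 2 for e, r in zip(estimates, references)) / len(estimates)
    return 100.0 * math.sqrt(mean_sq)


def absolute_reduction(baseline: float, improved: float) -> float:
    """Difference between two percent RMSEs, in percentage points."""
    return baseline - improved


def relative_reduction(baseline: float, improved: float) -> float:
    """Error reduction relative to the baseline percent RMSE, in percent."""
    return 100.0 * (baseline - improved) / baseline


if __name__ == "__main__":
    print(percent_rmse([102.0, 98.0], [100.0, 100.0]))
    # Hypothetical percent RMSEs of a benchmark and of an improved estimator.
    print(absolute_reduction(24.0, 12.0), relative_reduction(24.0, 12.0))
```

In the hypothetical case above, an absolute reduction of 12 points from a 24% baseline corresponds to a halved error, which is how the two figures quoted for a branch relate to each other.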

Figure 12 shows the results obtained in terms of percent RMSE of the active power flow estimations for all the network branches. The bar plot in violet (dashed line) shows the results obtained considering the prediction. The considerations that can be drawn from these results are similar to those obtained for the branch currents: the estimation improvements are more evident for the branches that are close to larger predicted loads. The error reduction is also more effective when lateral branches or leaves of the network are considered. In this case, the proposed algorithm brings a maximum reduction of about 11.5% (an error reduction of about 44%). Moreover, an average reduction of more than 3.3% is obtained.

As for reactive power estimations, it is possible to see in Figure 13 that the estimations are mainly affected by the prediction of the industrial loads locally, since the reactive power forecast is also available for them. A reduction of the percent RMSEs larger than 13% is obtained at branches 12, 13, and 16, while it is larger than 15% (the estimation error is more than halved in this case) for branch 6.

The test results highlight that the distribution state estimation performance improves significantly when the active and reactive powers forecasted by the neural predictors are introduced as pseudomeasurements, instead of the power consumptions measured at the same hour of the day before. The improvements are more evident for branches close to larger loads.

6. Conclusions

Neural network load forecasting models have been shown to produce reliable input information for a distribution state estimator, overcoming the problem of limited and time-delayed SM measurements or of temporary failures in the communication system. In order to improve the accuracy of the state estimation, different requirements have been fulfilled: (i) the neural models are able to forecast both active and reactive powers simultaneously with limited errors, starting from both exogenous and historical measurements; (ii) the correlation between the forecasted real and reactive power errors has been determined, which provides significant information for the state estimation algorithm; (iii) a closed-loop information flow allows the load forecasting, and hence the state estimation, even when real measurement data are missing, by replacing them with forecasted values; (iv) to build the weighting matrix needed to solve the state estimation problem effectively, the variance of the pseudomeasurements can be updated at fixed intervals by considering the measurements and forecast data obtained in the meantime.

The test results show that introducing pseudomeasurements forecasted by the neural predictors significantly improves the DSSE. More importantly, the improvements are more evident for the branches that are close to larger predicted loads, where the percent RMSEs of the power and current estimations are almost halved.

In summary, the proposed approach can be used for the state estimation of medium-voltage distribution networks that are either underdetermined, due to limited real-time measurements, or overdetermined but with delayed measurements from SMs.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This research work has been supported by “Fondazione di Sardegna” within the research project “SUM2GRIDS—Solutions by mUltidisciplinary approach for intelligent Monitoring and Management of power distribution GRIDS”—Triennal Agreement between Fondazione di Sardegna and Sardinian Universities, Regione Sardegna—L.R. 7/2007 year 2017–DGR 28/21, 17.05.2015. The article charge of this work was supported by the Open Access Publishing Fund of the University of Cagliari, with the funding of the Regione Autonoma della Sardegna L.R. n. 7/2007.