Abstract

Traditionally, the performance maps and emissions of a diesel engine are obtained empirically through many tests on dynamometers because no exact mathematical engine model exists. In the current literature, many artificial-neural-network (ANN) based approaches have been developed for diesel engine modelling. However, ANNs have drawbacks that make them difficult to put into practice, including multiple local minima, the user burden of selecting an optimal network structure, large training data requirements, and the risk of overfitting. To overcome these drawbacks, this paper proposes to apply an emerging technique, the relevance vector machine (RVM), to model the diesel engine and to predict its emissions and performance. With RVM, only a few experimental data sets are needed to train the model because of its globally optimal solution. In this study, the engine speed, load, and coolant temperature are used as the input parameters, while the brake thermal efficiency, brake-specific fuel consumption, and concentrations of nitrogen oxides and particulate matter are used as the output parameters. Experimental results show that the model accuracy is fairly good even when the training data are scarce. Moreover, the model accuracy is compared with that of a typical ANN. The evaluation results show that RVM is superior to the typical ANN approach.

1. Introduction

Air pollution is one of the most challenging problems today in many cities. The increased use of motor vehicles causes the amount of exhaust emissions to increase dramatically, which makes the problem more serious. Reducing the exhaust emissions from engines has then become an important concern of governments and motor vehicle manufacturers. Moreover, in view of the increasing oil price and the need to reduce emission of the global warming gas CO2, there is a demand to reduce fuel consumption while maintaining the engine performance. Therefore, many researchers have focused on the relations between these two issues, namely, engine performance and emissions.

Diesel engines, though having the advantages of high fuel efficiency and high durability when compared to other engines, are the major source of nitrogen oxides (NOx) and particulate matter (PM), which are harmful to human health and the environment. In particular, the fine and ultrafine particles (~10 micrometers or less) emitted by diesel engines can accumulate in the human respiratory system and cause various health problems [1], and they influence the global climate by absorbing solar radiation and reacting with other atmospheric constituents [2, 3]. Diesel engines are used extensively in buses and trucks; thus they are the major road-side emitters, posing a significant threat to the health of road users. In order to reduce these emissions, the combustion process of the engines has to be controlled. Additional hardware and instruments must be installed to monitor and control the engine operating parameters. Many experiments and tests must also be conducted to obtain a comprehensive understanding of the performance and emissions of the diesel engine. These are very complicated, time consuming, and expensive [4].

One way to solve these problems is to create a mathematical model of the diesel engine so that all the costly and immeasurable data can be predicted and virtual sensors can replace the costly physical sensors. However, the combustion process of a diesel engine is so complex that an exact mathematical model still does not exist today. Figure 1 shows one example of a diesel engine performance map with only three variables, in which the relationship among the engine load (torque), engine speed, and brake-specific fuel consumption is already highly nonlinear. If more variables are studied together, the model becomes very complicated and very difficult to obtain. Moreover, the mathematical model varies between engines.

In general, black-box identification is one of the commonly used modelling techniques suitable for engines because it can manage complex and uncertain information. Many recent studies in black-box identification have described the use of artificial neural networks (ANN) for modelling diesel engine performance [5–9] and emissions [7, 8, 10–13] based on experimental data sets. The ANN has, however, three main drawbacks in its learning process [14].
(1) The architecture, including the number of hidden neurons, has to be determined a priori or modified heuristically during training, which results in a suboptimal network structure.
(2) The training process (i.e., the minimization of the residual squared error cost function) can easily become stuck in local minima. Various ways of preventing local minima, such as early stopping and weight decay, have been employed. However, these methods greatly affect the generalization of the estimated function (i.e., its capacity to handle new input cases).
(3) The amount of training data required is usually large. Normally at least 200~400 sets of training data are required to build an accurate ANN engine model [15]. However, the collection of diesel engine emission and performance data is usually time consuming and costly, so the number of data sets is usually lower than 50, and ANN may therefore not be a good solution for diesel engine modelling.

To overcome the disadvantages of ANN, an algorithm called the relevance vector machine (RVM) was proposed by Tipping [16]. This approach is an emerging machine learning technique that is able to utilize more flexible candidate models, which are typically much sparser, offer probabilistic prediction, and avoid the need to set additional hyperparameters. Another advantage is that the training algorithm of RVM can ensure a globally optimal solution, whereas the learning process of ANN may end in a locally optimal solution, so ANN requires more training data to minimize this risk [14]. With this property, RVM is likely to need relatively little sample data to build an accurate model. However, one deficiency of this approach is that the training time grows approximately with the cube of the number of samples. Fortunately, a fast training algorithm [17] has been developed for RVM, which initializes with an "empty" model and sequentially "adds" samples to increase the marginal likelihood while also modifying their weights. Within the same principled framework, the objective function can also be increased by deleting samples which subsequently become redundant.

Recently, RVM has been applied to system modelling and predictive control [18–20]. These studies show that RVM is generally superior to ANN. However, there are very few applications of RVM to the modelling of diesel engines under scarce data. For these reasons, in the present paper, RVM is employed to model the performance and the NOx and PM emission characteristics of a diesel engine. Experiments are still required to provide sample data for RVM training. To demonstrate the effectiveness of this approach, a neural-network-based diesel engine model is also constructed and compared with the RVM model.

2. Relevance Vector Machine

The procedure of RVM modelling is introduced here. Consider a training data set $\mathbf{D}$ of $N$ input vectors $\{\mathbf{X}_n\}_{n=1}^{N}$, along with $N$ corresponding scalar-valued outputs $\{y_n\}_{n=1}^{N}$. The output $y_n$ is assumed to contain zero-mean Gaussian noise with variance $\sigma^2$. Hence, the prediction error $\varepsilon_n$ for $y_n$ follows a Gaussian distribution with zero mean and variance $\sigma^2$:

$$p(\varepsilon_n \mid \sigma^2) = \mathcal{N}(0, \sigma^2), \quad \text{with } y_n = f(\mathbf{X}_n, \mathbf{w}) + \varepsilon_n. \tag{1}$$

That is,

$$p(y_n \mid \mathbf{X}_n, \mathbf{w}, \sigma^2) = \mathcal{N}\big(f(\mathbf{X}_n, \mathbf{w}), \sigma^2\big), \tag{2}$$

where $f(\mathbf{X}_n, \mathbf{w})$ in (1) is the prediction model for the output $y_n$ given the input $\mathbf{X}_n$, and $\mathbf{w} = [w_0, \ldots, w_N]^T$ is the weight vector of the RVM model.

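The predicted output $\hat{y}$ at an input $\mathbf{X}$ in the kernel model can be represented by

$$\hat{y} = f(\mathbf{X}, \mathbf{w}) = \sum_{i=0}^{N} w_i K(\mathbf{X}, \mathbf{X}_i), \quad \text{or, stacked over the training inputs, } \hat{\mathbf{y}} = \boldsymbol{\Phi}\mathbf{w}, \tag{3}$$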

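where $K(\mathbf{X}, \mathbf{X}_i)$ is a basis function and $\boldsymbol{\Phi}$ is the $N \times (N+1)$ design matrix with $\boldsymbol{\Phi} = [\boldsymbol{\phi}(\mathbf{X}_1) \cdots \boldsymbol{\phi}(\mathbf{X}_N)]^T$, wherein $\boldsymbol{\phi}(\mathbf{X}_n) = [1, K(\mathbf{X}_n, \mathbf{X}_1), \ldots, K(\mathbf{X}_n, \mathbf{X}_N)]^T$. The leading 1 corresponds to the bias weight $w_0$, that is, $K(\mathbf{X}, \mathbf{X}_0) \equiv 1$. In this research, the radial basis function (RBF) is chosen as the basis function $K$ because it is commonly used for modelling problems. The approach for estimating $\hat{y}$ is to maximize the likelihood in

$$p(\mathbf{y} \mid \mathbf{w}, \sigma^2) = (2\pi)^{-N/2}\, \sigma^{-N} \exp\left(-\frac{\|\mathbf{y} - \boldsymbol{\Phi}\mathbf{w}\|^2}{2\sigma^2}\right). \tag{4}$$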
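As an illustration of the design matrix defined above, the following is a minimal NumPy sketch. The function name `rbf_design_matrix`, the parameter `width`, and the sample inputs are hypothetical, and the specific Gaussian form $\exp(-\|\mathbf{x} - \mathbf{c}\|^2 / \text{width}^2)$ is an assumption about the RBF used; the study's own MATLAB implementation is not reproduced here.

```python
import numpy as np

def rbf_design_matrix(X, centers, width):
    """Build the N x (M + 1) design matrix: a bias column of ones followed by
    K(x_n, c_i) = exp(-||x_n - c_i||^2 / width^2) for every centre c_i.
    The Gaussian form and the `width` parameter are assumptions of this sketch."""
    X = np.asarray(X, dtype=float)
    centers = np.asarray(centers, dtype=float)
    # Squared Euclidean distance between every input and every centre.
    sq_dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    K = np.exp(-sq_dist / width ** 2)
    return np.hstack([np.ones((X.shape[0], 1)), K])

# Hypothetical normalized inputs: (speed, load, coolant temperature).
X_train = np.array([[-1.0, -1.0, 0.2],
                    [0.0, 0.5, 0.4],
                    [1.0, 1.0, 0.9]])

# In an RVM the training inputs themselves serve as the kernel centres.
Phi = rbf_design_matrix(X_train, X_train, width=1.0)
print(Phi.shape)  # N rows and N + 1 columns
```

Each row of `Phi` is the vector $\boldsymbol{\phi}(\mathbf{X}_n)$, so multiplying by a weight vector gives the stacked predictions of (3).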

The likelihood function in (4) is complemented by a prior over the weights $\mathbf{w} = \{w_i\}$, $i = 0$ to $N$, to control the complexity of the model function and avoid overfitting. The prior is a zero-mean Gaussian probability distribution defined over every weight $w_i$ as follows:

$$p(\mathbf{w} \mid \boldsymbol{\alpha}) = (2\pi)^{-(N+1)/2} \prod_{i=0}^{N} \alpha_i^{1/2} \exp\left(-\frac{\alpha_i w_i^2}{2}\right). \tag{5}$$

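The hyperparameter vector, $\boldsymbol{\alpha} = [\alpha_0, \ldots, \alpha_N]^T$, controls how far each weight $w_i$ is allowed to deviate from zero. Consequently, using Bayes' rule, the posterior over $\mathbf{w}$ is given as follows:

$$p(\mathbf{w} \mid \mathbf{y}, \boldsymbol{\alpha}, \sigma^2) = \frac{p(\mathbf{y} \mid \mathbf{w}, \sigma^2)\, p(\mathbf{w} \mid \boldsymbol{\alpha})}{p(\mathbf{y} \mid \boldsymbol{\alpha}, \sigma^2)}, \tag{6}$$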

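where $p(\mathbf{y} \mid \boldsymbol{\alpha}, \sigma^2)$ is the normalizing factor. Because the likelihood $p(\mathbf{y} \mid \mathbf{w}, \sigma^2)$ and the prior $p(\mathbf{w} \mid \boldsymbol{\alpha})$ are both Gaussian, the posterior is Gaussian as well. The posterior mean $\boldsymbol{\mu}$ and covariance $\boldsymbol{\Sigma}$ are as follows [17]:

$$\boldsymbol{\Sigma} = \left(\mathbf{A} + \sigma^{-2}\boldsymbol{\Phi}^T\boldsymbol{\Phi}\right)^{-1}, \quad \boldsymbol{\mu} = \sigma^{-2}\boldsymbol{\Sigma}\boldsymbol{\Phi}^T\mathbf{y}, \tag{7}$$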

where $\mathbf{A}$ is defined as $\mathrm{diag}(\alpha_0, \ldots, \alpha_N)$. In fact, the $\mathbf{w}$ in (3) can be set to the fixed $\boldsymbol{\mu}$ for the purpose of point prediction.
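The posterior statistics in (7) amount to a few lines of linear algebra. The sketch below is illustrative only: the function name `posterior` and the toy data are hypothetical, and an explicit matrix inverse is used for readability rather than numerical efficiency.

```python
import numpy as np

def posterior(Phi, y, alpha, noise_var):
    """Posterior covariance and mean of the weights, following (7):
    Sigma = (A + Phi^T Phi / sigma^2)^-1 and mu = Sigma Phi^T y / sigma^2."""
    A = np.diag(alpha)
    Sigma = np.linalg.inv(A + Phi.T @ Phi / noise_var)
    mu = Sigma @ Phi.T @ y / noise_var
    return Sigma, mu

# Toy data (hypothetical): six samples, three basis functions, no noise.
rng = np.random.default_rng(0)
Phi = rng.normal(size=(6, 3))
y = Phi @ np.array([1.0, 0.0, -2.0])

# Very small alpha values give an almost flat prior, so the posterior mean
# is close to the ordinary least-squares solution.
Sigma, mu = posterior(Phi, y, alpha=np.full(3, 1e-6), noise_var=1e-2)
print(mu)
```

Conversely, a very large $\alpha_i$ pins the corresponding weight near zero, which is the mechanism behind the sparsity of the model.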

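Rather than extending the model to include Bayesian inference over those hyperparameters (which is analytically intractable), a most-probable point estimate, $\boldsymbol{\alpha}_{\mathrm{MP}}$, may be found via a type II maximum likelihood procedure. This is called sparse Bayesian learning, which is formulated as the local maximization with respect to $\boldsymbol{\alpha}$ of the marginal likelihood or, equivalently, its logarithm $L(\boldsymbol{\alpha})$:

$$L(\boldsymbol{\alpha}) = \log p(\mathbf{y} \mid \boldsymbol{\alpha}, \sigma^2) = \log \int p(\mathbf{y} \mid \mathbf{w}, \sigma^2)\, p(\mathbf{w} \mid \boldsymbol{\alpha})\, d\mathbf{w} = -\frac{1}{2}\left[N \log 2\pi + \log|\mathbf{C}| + \mathbf{y}^T\mathbf{C}^{-1}\mathbf{y}\right], \tag{8}$$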

where

$$\mathbf{C} = \sigma^2\mathbf{I} + \boldsymbol{\Phi}\mathbf{A}^{-1}\boldsymbol{\Phi}^T. \tag{9}$$
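The log marginal likelihood in (8) and (9) can be evaluated directly for a given set of finite hyperparameters. The following is a small sketch under that assumption (basis functions with infinite $\alpha_i$ would simply be left out of `Phi`); the function name is hypothetical.

```python
import numpy as np

def log_marginal_likelihood(Phi, y, alpha, noise_var):
    """Evaluate L(alpha) from (8) with C = sigma^2 I + Phi A^-1 Phi^T from (9).
    All entries of `alpha` are assumed finite and positive."""
    N = len(y)
    C = noise_var * np.eye(N) + Phi @ np.diag(1.0 / alpha) @ Phi.T
    # slogdet avoids overflow or underflow in the determinant.
    _, logdet = np.linalg.slogdet(C)
    quad = y @ np.linalg.solve(C, y)
    return -0.5 * (N * np.log(2.0 * np.pi) + logdet + quad)

# One-sample check: Phi = [[1]], alpha = 1, sigma^2 = 1 gives C = 2,
# so with y = 0 the result is -0.5 * (log(2*pi) + log(2)).
L_single = log_marginal_likelihood(np.array([[1.0]]), np.array([0.0]),
                                   np.array([1.0]), 1.0)
print(L_single)
```

Using `solve` instead of forming $\mathbf{C}^{-1}$ explicitly is a design choice of this sketch; it is usually more stable for the quadratic term.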

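The covariance, $\boldsymbol{\Sigma}_{\mathrm{MP}} = \boldsymbol{\Sigma}$, can be obtained by substituting $\boldsymbol{\alpha} = \boldsymbol{\alpha}_{\mathrm{MP}}$ into $\mathbf{A}$ in (7), so that the posterior mean weight, $\boldsymbol{\mu}_{\mathrm{MP}}$, is obtained by evaluating (7) again with $\boldsymbol{\Sigma} = \boldsymbol{\Sigma}_{\mathrm{MP}}$, giving a final (posterior mean) approximator:

$$Y = f(\mathbf{X}, \boldsymbol{\mu}_{\mathrm{MP}}) = \sum_{i=0}^{N} \mu_i^{\mathrm{MP}} K(\mathbf{X}, \mathbf{X}_i) = \sum_{i=0}^{N} \mu_i^{\mathrm{MP}} \exp\left(-\frac{\|\mathbf{X} - \mathbf{X}_i\|^2}{\sigma^2}\right), \tag{10}$$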

where $Y$ is the prediction of the model output for the unseen input data $\mathbf{X}$. One crucial observation is that typically the optimal values of many hyperparameters are infinite [16]. With (7), this leads to a parameter posterior infinitely peaked at zero for many weights $w_i$, with the consequence that $\boldsymbol{\mu}_{\mathrm{MP}}$ correspondingly comprises very few nonzero elements.

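A recent analysis has shown that $L(\boldsymbol{\alpha})$ has a unique maximum with respect to $\alpha_i$ [16]:

$$\alpha_i = \frac{s_i^2}{q_i^2 - s_i} \quad \text{if } q_i^2 > s_i, \tag{11}$$

$$\alpha_i = \infty \quad \text{if } q_i^2 \le s_i, \tag{12}$$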

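where $s_i$ and $q_i$ follow from the quantities $S_i$ and $Q_i$ as

$$s_i = \frac{\alpha_i S_i}{\alpha_i - S_i}, \quad q_i = \frac{\alpha_i Q_i}{\alpha_i - S_i}. \tag{13}$$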

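Note that when $\alpha_i = \infty$, $s_i = S_i$ and $q_i = Q_i$. It is convenient to utilize the Woodbury identity to obtain the quantities of interest:

$$S_i = \boldsymbol{\phi}_i^T \sigma^{-2}\mathbf{I}\, \boldsymbol{\phi}_i - \boldsymbol{\phi}_i^T \sigma^{-2}\mathbf{I}\, \boldsymbol{\Phi}\boldsymbol{\Sigma}\boldsymbol{\Phi}^T \sigma^{-2}\mathbf{I}\, \boldsymbol{\phi}_i, \tag{14}$$

$$Q_i = \boldsymbol{\phi}_i^T \sigma^{-2}\mathbf{I}\, \mathbf{y} - \boldsymbol{\phi}_i^T \sigma^{-2}\mathbf{I}\, \boldsymbol{\Phi}\boldsymbol{\Sigma}\boldsymbol{\Phi}^T \sigma^{-2}\mathbf{I}\, \mathbf{y}. \tag{15}$$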

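The results of (14) and (15) imply that
(1) if $\boldsymbol{\phi}_i$ is included in the model (i.e., $\alpha_i < \infty$) yet $q_i^2 \le s_i$, then $\boldsymbol{\phi}_i$ can be deleted (i.e., $\alpha_i$ is set to $\infty$);
(2) if $\boldsymbol{\phi}_i$ is excluded from the model ($\alpha_i = \infty$) and $q_i^2 > s_i$, then $\boldsymbol{\phi}_i$ can be added (i.e., $\alpha_i$ is set to the optimal finite value).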

To train and update the RVM model dynamically, a sequential learning algorithm is required. The algorithm starts with an empty model and sequentially adds basis functions to increase the marginal likelihood, modifying their weights along the way. Within the same principled framework, the likelihood can also be increased by deleting those basis functions which subsequently become redundant. Since this algorithm adds or deletes basis functions one at a time, the likelihood can be continually increased, and this mechanism makes online model update feasible. The steps of the sequential learning algorithm are shown below.

(1) Initialize $\sigma^2$ to some sensible value (e.g., $\mathrm{var}[\mathbf{y}] \times 0.1$).
(2) Initialize the model with a single basis vector $\boldsymbol{\phi}_i$, evaluating $S_i$ and $Q_i$ from (14) and (15), and compute $\alpha_i$ from (11), which can be simplified as

$$\alpha_i = \frac{\|\boldsymbol{\phi}_i\|^2}{\|\boldsymbol{\phi}_i^T\mathbf{y}\|^2 / \|\boldsymbol{\phi}_i\|^2 - \sigma^2}. \tag{16}$$

All other $\alpha_i$ are notionally set to infinity.
(3) Explicitly compute $\boldsymbol{\Sigma}$ and $\boldsymbol{\mu}$ (which are scalars initially), along with initial values of $s_i$ and $q_i$ for all $N$ basis functions $\boldsymbol{\phi}_i$.
(4) Select a candidate basis vector $\boldsymbol{\phi}_i$ from the set of all $N$ basis functions.
(5) Compute $\theta_i = q_i^2 - s_i$.
(6) If $\theta_i > 0$ and $\alpha_i < \infty$ (i.e., $\boldsymbol{\phi}_i$ is included in the model), then reestimate $\alpha_i$.
(7) If $\theta_i > 0$ and $\alpha_i = \infty$, then add $\boldsymbol{\phi}_i$ to the model with updated $\alpha_i$.
(8) If $\theta_i \le 0$ and $\alpha_i < \infty$, then delete $\boldsymbol{\phi}_i$ from the model and set $\alpha_i = \infty$.
(9) Estimate the noise level and update $\sigma^2$ as follows:

$$\sigma^2 = \frac{\sum_{n=1}^{N}\big(y_n - f(\mathbf{X}_n, \mathbf{w})\big)^2}{N - M + \sum_i \alpha_i \Sigma_{ii}}, \tag{17}$$

where $M$ is the number of basis functions currently in the model.
(10) Recompute or update $\boldsymbol{\Sigma}$, $\boldsymbol{\mu}$, and all $S_i$ and $Q_i$ using (7) and (13) to (15).
(11) If converged, then terminate; otherwise go to Step 4.
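The add, reestimate, and delete decisions of the steps above can be sketched compactly in NumPy. This is a simplified illustration, not the implementation used in this study: the noise variance is held fixed (Step 9 is omitted), candidates are visited in a fixed order, and $S_i$ and $Q_i$ are computed directly from $\mathbf{C}^{-1}$ rather than through the Woodbury form in (14) and (15). The function name and the toy data are hypothetical.

```python
import numpy as np

def sequential_rvm(Phi, y, noise_var, n_sweeps=50, tol=1e-6):
    """Simplified sequential sparse Bayesian learning with fixed noise variance.
    Returns (alpha, mu); alpha[i] = inf marks a basis function outside the model."""
    N, M = Phi.shape
    alpha = np.full(M, np.inf)
    # Step 2: start from the basis vector best aligned with y and apply (16).
    proj = (Phi.T @ y) ** 2 / (Phi ** 2).sum(axis=0)
    i0 = int(np.argmax(proj))
    denom = proj[i0] - noise_var
    alpha[i0] = (Phi[:, i0] @ Phi[:, i0]) / denom if denom > 0 else 1.0

    for _ in range(n_sweeps):
        max_change = 0.0
        for i in range(M):                      # Step 4: fixed visiting order
            active = np.isfinite(alpha)
            Pa = Phi[:, active]
            C = noise_var * np.eye(N) + (Pa / alpha[active]) @ Pa.T   # (9)
            Cinv_phi = np.linalg.solve(C, Phi[:, i])
            S, Q = Phi[:, i] @ Cinv_phi, Cinv_phi @ y
            if active[i]:                       # (13)
                s = alpha[i] * S / (alpha[i] - S)
                q = alpha[i] * Q / (alpha[i] - S)
            else:
                s, q = S, Q
            theta = q ** 2 - s                  # Step 5
            # Steps 6-7: add or reestimate via (11); Step 8: delete via (12).
            new_alpha = s ** 2 / theta if theta > 0 else np.inf
            if np.isfinite(new_alpha) and np.isfinite(alpha[i]):
                change = abs(np.log(new_alpha) - np.log(alpha[i]))
            elif np.isfinite(new_alpha) != np.isfinite(alpha[i]):
                change = np.inf                 # the model structure changed
            else:
                change = 0.0
            max_change = max(max_change, change)
            alpha[i] = new_alpha
        if max_change < tol:                    # Step 11
            break

    # Posterior mean from (7), restricted to the basis functions in the model.
    active = np.isfinite(alpha)
    mu = np.zeros(M)
    if active.any():
        Pa = Phi[:, active]
        Sigma = np.linalg.inv(np.diag(alpha[active]) + Pa.T @ Pa / noise_var)
        mu[active] = Sigma @ Pa.T @ y / noise_var
    return alpha, mu

# Toy problem (hypothetical): the target depends on one of five basis vectors.
rng = np.random.default_rng(0)
Phi = rng.normal(size=(30, 5))
y = 3.0 * Phi[:, 1]
alpha, mu = sequential_rvm(Phi, y, noise_var=1e-2)
print(np.flatnonzero(np.isfinite(alpha)), mu)
```

In this toy problem only the relevant basis vector keeps a finite $\alpha_i$, which illustrates how the sparsity of the model arises. Recomputing $\mathbf{C}$ for every candidate costs far more than the incremental updates of the fast algorithm [17]; it is done here only to keep the sketch short.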

It should be noted that the RVM modelling algorithm is a multi-input but single-output modelling method. Therefore, an individual model has to be constructed for each output. A multi-input/multi-output model can then easily be obtained by combining all the individual models.

3. Experimental Setup

Sample data sets are required for RVM training and are generally collected through experiments. In this study, the experiments were conducted on a naturally aspirated, water-cooled, 4-cylinder, direct-injection diesel engine. The specifications of the engine are shown in Table 1.

The engine was connected to an eddy-current dynamometer, and a control system was used for adjusting its speed and torque. Ultralow sulfur diesel fuel containing less than 10-ppm-wt sulfur was adopted in the test. The experimental setup is illustrated in Figure 2.

The experiments were carried out at engine speeds of 1200, 1400, 1600, 1800, and 2000 rpm, each at engine loads of 28, 70, 140, 210, and 252 Nm. For each test, the volumetric flow rate of fuel was measured using a measuring cylinder and then converted into a mass consumption rate, which was used to calculate the brake-specific fuel consumption (BSFC) and the brake thermal efficiency (BTE). The gaseous species in the engine exhaust, including CO, CO2, and NOx, were measured on a continuous basis using the Anapol EU5000 exhaust gas analyzer, which is suitable for measuring diesel engine emissions. The Anapol EU5000 used infrared sensors for measuring CO and CO2 concentrations and chemical cells for measuring NO and NO2 to obtain the NOx concentration. The gas analyzer was calibrated with standard and zero gases before each experiment. Particulate mass concentration was measured with a tapered element oscillating microbalance (TEOM, Series 1105, Rupprecht & Patashnick Co., Inc.). The exhaust gas from the engine was diluted with a Dekati minidiluter before passing through the TEOM. The dilution ratio (DR) was evaluated based on the following equation:

$$\mathrm{DR} = \frac{[\mathrm{CO_2}]_{\mathrm{exhaust}} - [\mathrm{CO_2}]_{\mathrm{background}}}{[\mathrm{CO_2}]_{\mathrm{diluted}} - [\mathrm{CO_2}]_{\mathrm{background}}}, \tag{18}$$

where [CO2]exhaust, [CO2]diluted, and [CO2]background represent the undiluted, the diluted, and the background CO2 concentrations, respectively. The dilution ratio was around 8 in the tests.
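As a worked illustration of (18), the short sketch below computes the dilution ratio from three CO2 readings. The function name and the readings (in ppm) are hypothetical values chosen only so that the result matches the ratio of about 8 reported above; they are not measurements from this study.

```python
def dilution_ratio(co2_exhaust, co2_diluted, co2_background):
    """Dilution ratio from (18); all three concentrations must share one unit."""
    denominator = co2_diluted - co2_background
    if denominator <= 0:
        raise ValueError("diluted CO2 must exceed the background CO2 level")
    return (co2_exhaust - co2_background) / denominator

# Hypothetical readings in ppm (not measured values from the experiments).
dr = dilution_ratio(co2_exhaust=80400.0, co2_diluted=10400.0, co2_background=400.0)
print(dr)  # 8.0
```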

At each speed and load, data were recorded after the engine had reached the steady state, which was indicated by the lubricating oil temperature and the coolant temperature. To reduce experimental uncertainties and ensure the repeatability of the test data, the data were recorded continuously for 5 minutes, and each test was carried out three times. The average values were used in this research.

Based on the measured data, the following parameters are derived:

Brake thermal efficiency:

$$\mathrm{BTE} = \eta_b = \frac{P_b}{\dot{m}_f \cdot \mathrm{LHV}}, \tag{19}$$

Brake-specific fuel consumption:

$$\mathrm{BSFC} = \frac{\dot{m}_f}{P_b}, \tag{20}$$

where $P_b$ is the brake power calculated from the measured torque and engine speed, $\dot{m}_f$ is the mass flow rate of the diesel fuel, and LHV is the lower heating value of the diesel fuel.
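The two derived quantities in (19) and (20) can be illustrated with a short calculation. In this sketch the brake power is taken as $P_b = 2\pi n T / 60$ for speed $n$ in rpm and torque $T$ in Nm, which is the standard relation; the fuel rate of 6.0 kg/h and the lower heating value of 42.5 MJ/kg are assumed example values, not data from this study, and the function names are hypothetical.

```python
import math

def brake_power_kw(torque_nm, speed_rpm):
    """Brake power in kW from torque (Nm) and speed (rpm)."""
    return 2.0 * math.pi * speed_rpm / 60.0 * torque_nm / 1000.0

def bsfc_g_per_kwh(fuel_kg_per_h, power_kw):
    """Brake-specific fuel consumption, (20), expressed in g/kWh."""
    return fuel_kg_per_h * 1000.0 / power_kw

def brake_thermal_efficiency(fuel_kg_per_h, power_kw, lhv_mj_per_kg):
    """Brake thermal efficiency, (19), as a fraction between 0 and 1."""
    fuel_power_kw = fuel_kg_per_h / 3600.0 * lhv_mj_per_kg * 1000.0
    return power_kw / fuel_power_kw

# One operating point from the test matrix: 140 Nm at 1800 rpm.
power = brake_power_kw(torque_nm=140.0, speed_rpm=1800.0)
# Assumed values for illustration only: fuel rate and lower heating value.
bsfc = bsfc_g_per_kwh(fuel_kg_per_h=6.0, power_kw=power)
bte = brake_thermal_efficiency(fuel_kg_per_h=6.0, power_kw=power, lhv_mj_per_kg=42.5)
print(round(power, 2), round(bsfc, 1), round(bte, 3))
```

With these units the two quantities are linked by BTE = 3600 / (BSFC × LHV), so a lower BSFC always corresponds to a higher BTE for a given fuel.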

4. Application of RVM and Modelling Results

To evaluate the effectiveness of RVM, the prediction models were built based on the experimental data. As the collection of the experimental data is time consuming and costly, only 22 data sets corresponding to different load and speed settings were collected from the experiments; they are shown in Table 2. Of these, 18 sets were used as the training data for model construction, and the remaining 4 sets were used for model validation and testing. Several weeks were required to collect the twenty-two data sets properly. Table 3 illustrates the use of each of the data sets.

The measured parameters in each of the data sets can be separated into two categories: input parameters and output parameters. Engine speed and engine load are the two most important independent parameters that affect the engine performance and emissions, so they are included in the input parameters. The coolant controls the engine temperature, so the coolant temperature is regarded as an important factor and is also treated as an input parameter. The brake-specific fuel consumption and the brake thermal efficiency represent the engine performance; thus, they are used as output parameters. Moreover, NOx and particulate matter are the two most serious exhaust emissions from diesel engines. Therefore, the output parameters also include the NOx concentration and the particulate mass concentration.

The RVM modelling was implemented using MATLAB. There are three input parameters and four output parameters, so four individual RVM models have to be built. Moreover, in order to obtain a more accurate modelling result and to prevent any input parameter from dominating the output value, the input data are conventionally normalized before training [21]. In this study, all the input values were normalized to the range [−1, 1].
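A minimal sketch of such a normalization is given below, assuming a simple linear (min-max) mapping of each input onto [−1, 1]; the exact scaling routine used in the MATLAB implementation is not stated, and the function names are hypothetical.

```python
def fit_range(values):
    """Return the (minimum, maximum) of one input variable in the training data."""
    return min(values), max(values)

def scale(value, lo, hi):
    """Map a value linearly so that lo -> -1 and hi -> +1."""
    if hi == lo:
        raise ValueError("cannot scale a constant input")
    return 2.0 * (value - lo) / (hi - lo) - 1.0

# Engine speeds used in the experiments (rpm).
speeds = [1200, 1400, 1600, 1800, 2000]
lo, hi = fit_range(speeds)
scaled = [scale(v, lo, hi) for v in speeds]
print(scaled)  # [-1.0, -0.5, 0.0, 0.5, 1.0]
```

Test inputs would be scaled with the same `lo` and `hi` obtained from the training data, so that both sets share one mapping.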

To verify the accuracy of the RVM model, the predicted output values are compared with the actual values from the test data sets, as shown in Figures 3, 4, 5, and 6.

The corresponding prediction errors are presented as the mean absolute percentage error (MAPE), evaluated against the experimental data sets using (21). Moreover, the fraction of variance ($R^2$ value) is calculated using (22) and (23). The smaller the MAPE, the better the modelling accuracy, and the best possible value of $R^2$ is 1:

$$\mathrm{MAPE} = \frac{1}{N_t}\sum_{k=1}^{N_t}\left|\frac{y_k - f(\mathbf{X}_k)}{y_k}\right| \times 100\%, \tag{21}$$

$$R^2 = 1 - \frac{\sum_{k=1}^{N_t}\big(y_k - f(\mathbf{X}_k)\big)^2}{\sum_{k=1}^{N_t}\big(y_k - \bar{y}\big)^2}, \tag{22}$$

$$\bar{y} = \frac{1}{N_t}\sum_{k=1}^{N_t} y_k, \tag{23}$$

where $\mathbf{X}_k$ is the $k$th input vector for the prediction, $f(\mathbf{X}_k)$ is the predicted value corresponding to $\mathbf{X}_k$, $y_k$ is the actual value corresponding to $\mathbf{X}_k$, $\bar{y}$ is the mean of the actual values, and $N_t$ is the number of test data points.
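The two accuracy measures in (21) to (23) can be computed as in the following sketch. The function names and the four actual/predicted pairs are hypothetical and serve only to show the arithmetic.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, (21), in percent."""
    errors = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(actual)

def r_squared(actual, predicted):
    """Fraction of variance explained, (22), with the mean taken from (23)."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

# Hypothetical test values (not data from this study).
actual = [100.0, 200.0, 400.0, 500.0]
predicted = [110.0, 190.0, 400.0, 450.0]
print(mape(actual, predicted), r_squared(actual, predicted))
```

For these values the relative errors are 10%, 5%, 0%, and 10%, giving a MAPE of 6.25%, while the residual and total sums of squares are 2700 and 100000, giving $R^2 = 0.973$. Note that (21) is undefined when an actual value is zero, which matters for outputs whose range starts near zero.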

Table 4 summarizes the training MAPE, the MAPE over the test data sets, and the fraction of variance for each output parameter of the RVM model.

5. Comparison of RVM and ANN Modelling Results

To illustrate the advantages of the proposed RVM model, its prediction results were compared with those of a multilayer feed-forward neural network trained with backpropagation. Since the multilayer feed-forward neural network is a well-known universal approximator [22] and many studies on diesel engine performance modelling [5–11, 13] are based on this configuration, its results can be considered a rather standard benchmark.

A neural network with one hidden layer was built based on the same training data sets used for RVM modelling. The neural network consists of 3 input neurons, 20 hidden neurons, and 4 output neurons. The number of hidden neurons was determined by a trial-and-error analysis, varying it between 3 and 30. This burden illustrates the inefficiency of the ANN approach.

The activation function used inside the hidden layer was the Tan-Sigmoid transfer function, while a pure linear filter was employed for the output layer. Levenberg-Marquardt algorithm was used as the training algorithm. The learning rate of the weight update was set to be 0.05. Figure 7 depicts the architecture of the neural network.
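To make the network structure concrete, the sketch below implements only the forward pass of a 3-20-4 network with a hyperbolic tangent hidden layer (mathematically equivalent to the Tan-Sigmoid transfer function) and a linear output layer. The random initial weights, the function names, and the sample input are hypothetical, and the Levenberg-Marquardt training step is not reproduced here.

```python
import numpy as np

def init_network(n_in=3, n_hidden=20, n_out=4, seed=0):
    """Create randomly initialized weights for a one-hidden-layer network."""
    rng = np.random.default_rng(seed)
    return {
        "W1": rng.normal(scale=0.5, size=(n_hidden, n_in)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(scale=0.5, size=(n_out, n_hidden)),
        "b2": np.zeros(n_out),
    }

def forward(params, x):
    """Forward pass: tanh hidden layer followed by a linear output layer."""
    hidden = np.tanh(params["W1"] @ x + params["b1"])
    return params["W2"] @ hidden + params["b2"]

params = init_network()
# Hypothetical normalized input: (speed, load, coolant temperature).
outputs = forward(params, np.array([0.0, -0.5, 0.3]))
n_params = sum(p.size for p in params.values())
print(outputs.shape, n_params)  # (4,) 164
```

The parameter count, 3 × 20 + 20 + 20 × 4 + 4 = 164, is large relative to the 18 training data sets, which is consistent with the difficulty of training such a network on scarce data.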

The same test sets were also chosen so that the RVM and ANN models can be compared fairly. The prediction accuracy of each output of the ANN model is illustrated in Figures 8, 9, 10, and 11 and Table 5.

Tables 4 and 5 show that the RVM outperforms the ANN by about 36.45% in terms of average MAPE under the same test sets. The relatively high training MAPE of the ANN shows that the data sets are not sufficient for building such a highly nonlinear model. Furthermore, only one initial value $\sigma$ of the basis width is required by RVM, while the learning rate, number of hidden layers, and number of hidden neurons are required by ANN, which means that a grid of guessed values for these parameters has to be prepared.

The MAPEs of both RVM and ANN for predicting the mass concentration of particulates are relatively large compared to those of the other output parameters. This is because the mass concentration varies over a wide range, from 0 to $12 \times 10^4$ μg/m³. The RVM model tries to fit a function over the whole range rather than focusing on the low end of the range, which is reflected in its R-squared value of 0.97. In contrast, the ANN model tends to concentrate on the low end of the range. As a result, the R-squared value of the ANN model is only 0.02, which is unacceptable.

Overall, the prediction accuracy of RVM with a small amount of training data is satisfactory.

6. Conclusions

This research is the first attempt at applying RVM to model the diesel engine performance and the NOx and particulate matter emission characteristics under the condition of scarce data. Although the combustion process of the diesel engine is unknown, the RVM model has successfully captured the relation between the controllable factors, namely the engine speed, engine load, and coolant temperature, and the output variables, namely the brake-specific fuel consumption, brake thermal efficiency, NOx emission, and particulate mass concentration. Experimental results show that the accuracy of the RVM model remains acceptable even when the data sets are few. It is believed that more data sets can improve the accuracy of the model.

Furthermore, the RVM model is also compared with an ANN model. The results indicate that the average accuracy of the RVM model is higher than that of the ANN model by about 36.45%, implying that RVM is superior to ANN.

With the proposed RVM model, experimental efforts can be reduced significantly, as the performance and emissions of the diesel engine can be predicted easily. By applying this RVM model as a virtual sensor on diesel vehicles, the exhaust emissions can be controlled more effectively in combination with advanced control algorithms, such as model predictive control. The study of model predictive diesel emission control based on the RVM model will be considered as future work. Since RVM can also perform online model update, the applications of RVM to online system modelling and online control will also be explored in the future.

Acknowledgments

The research is supported by the University of Macau Research Grant, Grant no. MYRG149(Y2-L2)-FST11-WPK and the short-term visiting scholar programme of University of Macau. The authors would also like to thank the support from the Hong Kong Polytechnic University and the technician, Mr. Wong, Hang Cheong, of the Automotive Engineering Laboratory of University of Macau.