Abstract

For system failure prediction, automatically building models from historical failure datasets is one of the major challenges in practical engineering fields. In this paper, an effective algorithm is proposed to build the failure prediction Bayesian network (FPBN) model with data mining technology. First, the concept of FPBN is introduced to describe the states of components and the system and the cause-effect relationships among them. The types of network nodes, the directions of network edges, and the conditional probability distributions (CPDs) of nodes in FPBN are discussed in detail. According to the characteristics of nodes and edges in FPBN, a divide-and-conquer principle based algorithm (FPBN-DC) is introduced to build the best FPBN network structures for the different types of nodes separately. Then, the CPDs of nodes in FPBN are calculated by the maximum likelihood estimation method based on the learned network. Finally, a simulation study of a helicopter convertor model is carried out to demonstrate the application of FPBN-DC. According to the simulation results, the FPBN-DC algorithm obtains better fitness values with fewer iterations, which verifies its effectiveness and efficiency compared with the traditional algorithm.

1. Introduction

With the development of information and computer technologies, modern systems have become more complex while the relationships among systems have also become more complicated. To fulfil the high demands of system safety, operational efficiency, and life cycle cost, the key objective is to predict the system state and warn of potential failures with the help of advanced methods, so that great losses can be avoided before a failure happens [1].

Failure prediction approaches are commonly divided into three types: experience-based, condition-based, and model-based methods [2]. Li et al. [3] performed a reliability analysis with an emphasis on predicting the lifetime of a diesel engine's turbocharger, in which the failure modes and criticality information are fully utilized. Wang and Jiang [4] evaluated the degradation of complex system performance from condition monitoring information based on the support vector machine (SVM). Zhang et al. [5] proposed a particle swarm optimization based SVM model for software reliability prediction. Although many interesting methods have been proposed for failure prediction, the model-based method has played a more important role in engineering fields because of its advantages in effectiveness and efficiency.

Data mining, also referred to as knowledge discovery, is defined as the process of extracting nontrivial, implicit, previously unknown, and potentially useful information from databases [6]. With the wide application of maintenance information management systems, operation data can be collected easily. So, many scientists and engineers have applied artificial intelligence or statistical methods to establish failure prediction models. For instance, Chen et al. [7] proposed a manufacturing defect detection method using association rule mining. Han et al. [8] used sequential association rule mining to extract failure patterns and forecast failure sequences of Republic of Korea Air Force aircraft under various combinations of aircraft type, location, mission, and season. Dong [9] described the concepts, models, algorithms, and applications of hidden Markov models and hidden semi-Markov models in engineering asset health prognosis.

A Bayesian network (BN) is a directed acyclic graph which can represent uncertain knowledge by describing the relationships and influences among variables [10]. Built upon Bayes' theorem, BN is designed to obtain posterior probabilities of unknown variables from known probabilistic relationships. Moreover, with the help of graphical diagrams consisting of nodes and edges, a BN can be understood more easily than many other techniques. So, BN has gained great popularity for solving system modeling problems in broad engineering fields owing to these advantages [11, 12]. Particularly for system reliability prediction, Muller et al. [13] formulated a dynamic prognosis BN model with knowledge from functional and dynamic modeling. Langseth and Portinale [14] proposed a BN modeling framework which can translate a standard fault tree into a BN. Doguc and Ramirez-Marquez [15] studied a BN construction method for system reliability estimation and provided a step-by-step illustration of the method. Weber and Jouffe [16] developed the dynamic object-oriented BN by integrating system functioning and malfunctioning knowledge. Mahadevan et al. [17] applied BN to structural system reliability reassessment and validated it by analytical comparison.

Generally, it is not easy to build and quantify a BN's relationships for practical cases based only on expert opinions, especially for uncertain reliability prediction problems. Because system operation data are abundant in quantity and various in characteristics, this paper introduces an expanded BN model to describe the failure prediction process for complex systems under uncertainty and proposes a divide-and-conquer principle based data mining algorithm to build the corresponding model.

The rest of this paper is organized as follows. Section 2 describes the failure prediction Bayesian network (FPBN), including node types, edge directions, and conditional probability distributions (CPDs). In order to facilitate FPBN modeling with failure data, a divide-and-conquer principle based modeling method is proposed in Section 3. With the helicopter convertor case, Section 4 illustrates the application of the proposed FPBN modeling method. Section 5 concludes this study and gives several possible future research topics.

2. Failure Prediction Bayesian Network

By inheriting the advantages of BN, the FPBN is introduced to describe the states of components and the system and the cause-effect relationships among them for system failure prediction [18]. An FPBN is also described as a triple $G = (V, E, P)$, where $V$ represents the nodes, $E$ represents the edges, and $P$ represents the CPDs. However, some practical assumptions are imposed on the nodes and edges in FPBN according to the characteristics of failure prediction tasks, so the FPBN can perform more efficiently than a traditional BN in the field of system failure prediction. A simple example is shown in Figure 1.

2.1. Types of Nodes in FPBN

In a BN, all nodes represent variables of equal status. In an FPBN, the nodes reflect the states of components or the system in practical engineering systems. According to their roles in the system failure prediction process, the node set $V$ is divided into three subsets, $V = V_C \cup V_M \cup V_D$, comprising the failure cause subset $V_C$, the failure mode subset $V_M$, and the failure detection subset $V_D$. As in BN, the values of all nodes in FPBN are discrete and mutually exclusive. For binary systems, each node has two states, functioning as 0 and failure as 1. For multistate systems, a node has additional failure states, which are represented as $\{0, 1, 2, \ldots\}$.

2.1.1. Failure Cause Nodes

This type of node describes the root causes of a certain failure mode. In the failure prediction process, the possible states of a failure cause node can be derived from detected information, reliability estimation, or expert initialization.

2.1.2. Failure Mode Nodes

The failure mode node represents the operational state of the system, which is the final object of the failure prediction task. There is usually only one failure mode node, and it is the objective of the failure prediction.

2.1.3. Failure Detection Nodes

Failure detection nodes describe the detectable states of certain sensors, lights, or alarms. They are affected by the failure cause nodes or the failure mode node in the FPBN model.
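To make the three node roles concrete, the following minimal Python sketch shows one simple way the subsets $V_C$, $V_M$, and $V_D$ could be represented in code. All class and variable names are hypothetical, and the example nodes are loosely borrowed from the helicopter convertor case in Section 4.

```python
from enum import Enum

class NodeType(Enum):
    """The three node roles in an FPBN."""
    FAILURE_CAUSE = "cause"        # V_C: root causes of a failure mode
    FAILURE_MODE = "mode"          # V_M: system-level failure being predicted
    FAILURE_DETECTION = "detect"   # V_D: observable sensor/alarm indications

# Hypothetical binary nodes (0 = functioning, 1 = failure).
fpbn_nodes = {
    "PowerPart":       NodeType.FAILURE_CAUSE,
    "VoltageAdjustor": NodeType.FAILURE_CAUSE,
    "NoOutput":        NodeType.FAILURE_MODE,
    "VoltageOutput":   NodeType.FAILURE_DETECTION,
}

# Group the node set V into the three subsets V_C, V_M, V_D.
subsets = {t: [n for n, nt in fpbn_nodes.items() if nt is t] for t in NodeType}
print(subsets)
```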

2.2. Directions of Edges in FPBN

When dealing with system failure prediction problems in practice, maintenance engineers usually use the failure detection information to diagnose the possible states of the corresponding failure causes and then integrate the failure cause states to estimate the probability of the failure mode.

In a traditional BN, an edge represents the relationship between any two nodes, while in an FPBN, each edge $(V_i, V_j) \in E$ indicates a cause-effect relationship between $V_i$ and $V_j$: $V_i$ is the cause of $V_j$, and $V_j$ is the effect of $V_i$. In particular, the directions of edges between different types of node subsets are fixed in advance. As shown in Figure 1, the failure mode node VL can only be affected by the failure cause nodes (HP, HV), while the failure cause nodes (HP, HV) and the failure mode node VL can be revealed by the failure detection node HT. Among nodes of the same subset, such as (HP, HV), there is no restriction on the direction of the edge between them; such edges can only be determined from the operation dataset or expert knowledge. The directions of edges between different node subsets in FPBN are consistent with the reasoning process of failure prediction tasks, so the practical FPBN is easy for maintenance engineers to understand.
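As a rough illustration of these edge-direction restrictions, the sketch below (an assumption for illustration, not part of the FPBN definition) lists the subset-to-subset directions permitted in this section and rejects any candidate edge that violates them.

```python
from enum import Enum

class NodeType(Enum):
    CAUSE = "cause"
    MODE = "mode"
    DETECT = "detect"

# Directions permitted between subsets in an FPBN (reconstructed from the
# rules of Section 2.2, not a library API).
ALLOWED = {
    (NodeType.CAUSE, NodeType.CAUSE),   # cause -> cause: unrestricted, learned from data
    (NodeType.CAUSE, NodeType.MODE),    # causes drive the failure mode
    (NodeType.CAUSE, NodeType.DETECT),  # causes are revealed by detections
    (NodeType.MODE, NodeType.DETECT),   # the failure mode is revealed by detections
}

def edge_allowed(src_type: NodeType, dst_type: NodeType) -> bool:
    """Return True if an edge src -> dst respects the FPBN direction rules."""
    return (src_type, dst_type) in ALLOWED

# Example: a detection node can never point back at a failure cause node.
print(edge_allowed(NodeType.DETECT, NodeType.CAUSE))  # False
print(edge_allowed(NodeType.CAUSE, NodeType.DETECT))  # True
```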

2.3. Conditional Probability Distributions in FPBN

The parameter set $P$ in FPBN has the same meaning as in a traditional BN. Each element of $P$ is the CPD of a node $V_i$, $P(V_i \mid \mathrm{Pa}(V_i))$, which expresses the strength of the dependence between $V_i$ and its father nodes $\mathrm{Pa}(V_i)$. For the root nodes, which have no father nodes, the CPDs reduce to the corresponding prior probability distributions.

As in a BN, a node in FPBN is conditionally independent of its non-descendant nodes once the states of all its father nodes are known. When the actual states of the failure detection nodes are entered, the FPBN model is evaluated with the CPDs to estimate the states of the failure cause nodes, which in turn determine the state of the failure mode node.
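The reasoning from an observed detection state back to a failure cause is ordinary Bayesian updating. The toy computation below illustrates it for a single cause-detection pair; all probability values are invented for the example and are not taken from the paper.

```python
# Hypothetical two-node fragment: cause node C -> detection node D,
# both binary (0 = functioning, 1 = failure). Numbers are illustrative only.
p_c1 = 0.05               # prior P(C = 1)
p_d1_given_c1 = 0.90      # P(D = 1 | C = 1), detection sensitivity
p_d1_given_c0 = 0.02      # P(D = 1 | C = 0), false-alarm rate

# Bayes' theorem: P(C = 1 | D = 1) = P(D = 1 | C = 1) P(C = 1) / P(D = 1)
p_d1 = p_d1_given_c1 * p_c1 + p_d1_given_c0 * (1 - p_c1)
p_c1_given_d1 = p_d1_given_c1 * p_c1 / p_d1
print(f"P(cause failed | alarm on) = {p_c1_given_d1:.3f}")  # about 0.703
```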

3. Modeling of FPBN with Divide-and-Conquer Principle

3.1. BN Modeling Method Based on Data Mining

Since building an objective BN model from expert experience alone is not easy, learning a practical model from a dataset with data mining methods has attracted considerable attention recently [19]. The BN modeling process usually consists of two parts: learning the BN structure, which is represented by nodes and edges, and learning the BN parameters, which specify the CPDs of the BN.

The key problem of learning a BN structure from a dataset is to find the network structure that most accurately represents the potential relationships in the dataset. Because learning the BN structure from a dataset is an NP-hard problem for large networks, conditional independence test based algorithms and score-and-search based algorithms have been proposed to tackle this challenge [20]. The former discover the potential conditional independence relationships among nodes from the dataset with conditional independence tests and build the BN based on such relationships [21]. In the score-and-search based methods, a score function is used as the criterion for how well a candidate network structure fits the dataset, while a search algorithm is applied to find the structure with the highest score among all candidates. Popular score functions include the Cooper-Herskovits function [22], the Bayesian information criterion (BIC) function [23], and the minimum description length [24]. In the search algorithms, the BN structure is usually encoded as an ordered string or a connection matrix, and different operators have been designed to find the structure with the highest score. Common algorithms include the genetic algorithm [25], evolutionary programming [26], ant colony optimization [27], integer linear programming [28], globally parallel learning [29], and heuristic equivalent learning [30].

Such algorithms are mainly proposed for general BN structure learning, where no restriction is placed on the directions of edges. There is also another kind of BN structure learning in which the ordering of all nodes is known and a later node can only be a child of an earlier node. The well-known K2 algorithm handles this kind of structure learning problem well with deterministic searching [22]. FPBN modeling is actually a new kind of BN structure learning, in which the ordering of the node subsets is known and a node in a later subset can only be a child of a node in an earlier subset. The K2 algorithm cannot be applied directly to FPBN structure learning because it requires a complete ordering of all individual nodes. The general BN structure learning algorithms usually consume a great deal of time when the number of nodes is large and the edges between nodes are complex. So, the characteristics of the nodes and edges in FPBN should be exploited to decrease the number of candidate network structures and limit the search space.

3.2. Structure Learning of FPBN with Divide-and-Conquer Principle

In computer science, the divide-and-conquer principle is an important algorithm design paradigm based on multibranched recursion [31]. A divide-and-conquer algorithm breaks a problem down into two or more subproblems of the same type which are simple enough to be solved directly. The solutions to the subproblems are then combined to give a solution to the original problem. The correctness of a divide-and-conquer algorithm can be proved by mathematical induction, and its computational cost is often determined by solving recurrence relations.

According to the directions of edges in FPBN, it is clear that the father nodes of a failure detection node may belong to the failure cause and failure mode subsets; the father nodes of the failure mode node can only be failure cause nodes; and, for a node in the failure cause subset, every other node in that subset could be its father node. With these restrictions, a divide-and-conquer principle based algorithm (FPBN-DC) is introduced to learn the FPBN network structure. It builds the network structures for the failure detection, failure mode, and failure cause nodes separately. The modeling process of the FPBN-DC algorithm is listed as follows.

Step 1. Initialize the node set $V$ of the FPBN and classify the nodes in $V$ into the three subsets according to their types. The three subsets are ordered as $V_C$, $V_M$, $V_D$, which means that a node in a later subset can only be a child of a node in an earlier subset.

Step 2. Choose the score function to evaluate candidate FPBN network structures. The BIC score [23] is used as the score function:
$$\mathrm{BIC}(G \mid D) = \sum_{i=1}^{n} \sum_{j=1}^{q_i} \sum_{k=1}^{r_i} m_{ijk} \log \frac{m_{ijk}}{m_{ij}} - \frac{\log m}{2} \sum_{i=1}^{n} q_i (r_i - 1). \qquad (1)$$
Because the BIC of the whole network can be decomposed into the sum of the single-node terms, (1) can be rewritten as
$$\mathrm{BIC}(G \mid D) = \sum_{i=1}^{n} \mathrm{BIC}_i, \qquad \mathrm{BIC}_i = \sum_{j=1}^{q_i} \sum_{k=1}^{r_i} m_{ijk} \log \frac{m_{ijk}}{m_{ij}} - \frac{\log m}{2} q_i (r_i - 1), \qquad (2)$$
where $n$ represents the number of nodes in FPBN; $q_i$ represents the number of candidate combination states of the father nodes of the $i$th node; $r_i$ represents the number of candidate states of the $i$th node; $m_{ijk}$ represents the number of failure records in which the $i$th node is in the $k$th state and its father node set is in the $j$th state; $m_{ij}$ represents the number of failure records in which the father node set of the $i$th node is in the $j$th state; and $m$ represents the total number of failure data records.
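As a minimal sketch of the per-node term $\mathrm{BIC}_i$ in (2), the function below counts $m_{ijk}$ and $m_{ij}$ from a dataset held as a list of records; the data layout and function name are assumptions made for illustration.

```python
import math
from collections import Counter

def bic_node(data, node, parents, states):
    """Per-node BIC term from (2): sum_jk m_ijk * log(m_ijk / m_ij)
    minus (log m / 2) * q_i * (r_i - 1).

    data    : list of records, each a dict {node_name: discrete state}
    node    : name of the i-th node
    parents : list of its candidate father nodes
    states  : dict {node_name: list of possible states}
    """
    m = len(data)
    r_i = len(states[node])
    q_i = max(1, math.prod(len(states[p]) for p in parents))

    # m_ijk: joint counts of (parent configuration j, node state k)
    m_ijk = Counter((tuple(rec[p] for p in parents), rec[node]) for rec in data)
    # m_ij: counts of each parent configuration j
    m_ij = Counter(tuple(rec[p] for p in parents) for rec in data)

    loglik = sum(c * math.log(c / m_ij[j]) for (j, _), c in m_ijk.items())
    penalty = 0.5 * math.log(m) * q_i * (r_i - 1)
    return loglik - penalty

# Example with a tiny synthetic dataset (hypothetical binary states).
data = [{"HP": 0, "HT": 0}, {"HP": 0, "HT": 0}, {"HP": 1, "HT": 1}, {"HP": 1, "HT": 0}]
states = {"HP": [0, 1], "HT": [0, 1]}
print(bic_node(data, "HT", ["HP"], states))   # score with HP as father
print(bic_node(data, "HT", [], states))       # score with no fathers
```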

To find the network structure with the highest $\mathrm{BIC}(G \mid D)$, it is clear that every term of the decomposition should score as high as possible. Since FPBN is an extension of BN, it still has to satisfy the requirement that there is no loop in the network structure. So, the key point is how to decompose the search logically while guaranteeing that no loop appears in the corresponding network structure.

Theorem 1. If there is no loop inside the interior structure of each of the node subsets $V_C$, $V_M$, and $V_D$ in FPBN, then no loop exists in the whole FPBN model.

Proof. Suppose there is a loop in an FPBN model that has no loop inside the three subsets. Then the loop must contain at least one edge leaving one of the subsets $V_C$, $V_M$, $V_D$ and another edge pointing back into that subset. But according to the edge directions of FPBN, an edge can only point from an earlier subset to a later subset and can never point back to an earlier one. So, such a loop cannot exist in the FPBN model.

Lemma 2. The maximal BIC score of the FPBN model can be decomposed into the sum of the maximal BIC scores of the three subsets: $\mathrm{BIC}_{\max}(G) = \mathrm{BIC}_{\max}(V_C) + \mathrm{BIC}_{\max}(V_M) + \mathrm{BIC}_{\max}(V_D)$.

Proof. According to Theorem 1, there will be no loop between the subsets. When every subset has the highest score with no loop inside the subset, the whole FPBN structure satisfies the limitation of no loop and the score of the structure is the highest.

With Lemma 2, the FPBN structure searching problem for the highest score is divided into three small scale structure searching problems.

Step 3. Randomly select a node $V_i$ belonging to the failure detection subset $V_D$ and remove it from $V_D$. Its candidate father node set is $\Pi_i = V_C \cup V_M$.

Theorem 3. Adding a father node to a node $V_i$ that belongs to the failure detection subset $V_D$ will not form a loop inside the subset $V_D$.

Proof. Because the candidate father node set $\Pi_i = V_C \cup V_M$ contains no failure detection nodes, adding a father node to any node belonging to $V_D$ never connects two nodes inside $V_D$ with an edge. This means that there is no edge, and hence no loop, between failure detection nodes.

Lemma 4. The maximal BIC score of the subset $V_D$ can be decomposed into the sum of the maximal BIC scores of the nodes inside the subset: $\mathrm{BIC}_{\max}(V_D) = \sum_{V_i \in V_D} \mathrm{BIC}_{\max}(V_i)$.

Proof. According to Theorem 3, there is no loop among the nodes inside the subset $V_D$. When each node attains its highest score, the subset structure satisfies the no-loop constraint and the score of the subset is the highest.

Lemma 4 reduces the search complexity: only the highest score of each single node inside the subset $V_D$ needs to be calculated.

Step 4. For the selected node $V_i$, and for every node in its candidate father node set $\Pi_i$, compute the updated structure score obtained by adding that node to the actual father node set of $V_i$. According to Theorem 3, the structure does not need to be checked for loops when a node is added to the father node set of $V_i$.

Step 5. Select the node in $\Pi_i$ that leads to the highest score of $V_i$ and denote this score by $S_{\mathrm{new}}$. If $S_{\mathrm{new}}$ is higher than the old structure score $S_{\mathrm{old}}$, move this node from $\Pi_i$ to the actual father node set of $V_i$ and update the score as $S_{\mathrm{old}} = S_{\mathrm{new}}$. Return to Step 4 and search for further father nodes of $V_i$ in the remaining candidate father node set $\Pi_i$. If $S_{\mathrm{new}} \le S_{\mathrm{old}}$, no candidate father node in $\Pi_i$ can lead to a higher score; turn to Step 6.

Step 6. Check whether there is still a node left in the failure detection subset $V_D$. If yes, return to Step 3 and search for its maximal score. If no, the highest score of every node in subset $V_D$ has been found; turn to Step 7 to search for the maximal score of the failure mode node.
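Steps 3-6 amount to a greedy, K2-style father search carried out independently for each failure detection node. The sketch below expresses that loop generically; `score` stands for any decomposable per-node score such as the $\mathrm{BIC}_i$ term sketched after (2), and all names are illustrative.

```python
def greedy_parents(node, candidates, score):
    """Greedy father-node selection for one node (Steps 3-6).

    node       : the node whose fathers are being learned
    candidates : iterable of allowed candidate fathers (e.g. V_C + V_M
                 for a failure detection node)
    score      : callable score(node, parents) returning a decomposable
                 per-node score such as BIC_i
    Returns the selected father set and its score.
    """
    parents = []
    best = score(node, parents)
    remaining = list(candidates)
    while remaining:
        # Try adding each remaining candidate and keep the best improvement.
        scored = [(score(node, parents + [c]), c) for c in remaining]
        s_new, c_best = max(scored)
        if s_new <= best:          # no candidate improves the score: stop
            break
        parents.append(c_best)
        remaining.remove(c_best)
        best = s_new
    return parents, best

# Usage sketch: detection nodes may take fathers only from V_C and V_M.
# for d in V_D:
#     fathers[d], _ = greedy_parents(d, V_C + V_M, score)
```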

Step 7. According to the FPBN description, there is usually only one failure mode node in the subset $V_M$. Select this node $V_j$ and remove it from $V_M$. Its candidate father node set is $\Pi_j = V_C$.

Theorem 5. Adding a father node to the node $V_j$ that belongs to the failure mode subset $V_M$ will not form a loop inside the subset $V_M$.

Proof. Because there is only one node in the failure mode subset $V_M$ and its candidate father nodes all belong to the failure cause subset $V_C$, adding a father node to the node in $V_M$ can never connect that node to itself with an edge. This means that there is no loop in the subset $V_M$.

Step 8. For the selected node $V_j$, and for every node in its candidate father node set $\Pi_j$, compute the updated structure score obtained by adding that node to the actual father node set of $V_j$. According to Theorem 5, the structure does not need to be checked for loops when a node is added to the father node set of $V_j$.

Step 9. Select the node in $\Pi_j$ that leads to the highest score of $V_j$ and denote this score by $S_{\mathrm{new}}$. If $S_{\mathrm{new}}$ is higher than the old structure score $S_{\mathrm{old}}$, move this node from $\Pi_j$ to the actual father node set of $V_j$ and update the score as $S_{\mathrm{old}} = S_{\mathrm{new}}$. Return to Step 8 and search for further father nodes of $V_j$ in the remaining candidate father node set $\Pi_j$. If $S_{\mathrm{new}} \le S_{\mathrm{old}}$, no candidate father node in $\Pi_j$ can lead to a higher score; turn to Step 10.

Step 10. The father nodes of a failure cause node in the subset $V_C$ can be any other nodes in $V_C$. So, the problem reduces to learning a general BN structure inside the subset $V_C$ with the highest score $\mathrm{BIC}_{\max}(V_C)$. An immune algorithm based structure learning method for BN (BN-IA) [32] is applied to this subproblem.

Step 11. The maximal scores of the three subsets have all been calculated, and the score of the whole FPBN is their sum.

Using the FPBN-DC algorithm, the original FPBN structure learning problem is broken down into three BN structure learning problems with fewer nodes each. These three smaller-scale search problems can then be solved easily with general BN structure learning methods.

3.3. Parameter Learning of FPBN

For a BN, parameter learning is to find the parameter set $\theta$ that maximizes the likelihood function $P(D \mid G, \theta)$ once the best network structure $G$ has been learned from the dataset $D$. The calculation of $\theta$ is a parameter estimation problem in statistics and is usually solved by the maximum likelihood estimation (MLE) method [33].

In the MLE based BN parameter learning method, the CPDs of the nodes are the parameters $\theta_{ijk} = P(V_i = k \mid \mathrm{Pa}(V_i) = j)$. By counting, from the dataset, the state distribution of each node under every state combination of all its father nodes, the MLE method finds the best probability distributions for all nodes. Each parameter in $\theta$ is calculated as
$$\hat{\theta}_{ijk} = \frac{m_{ijk}}{m_{ij}}, \qquad (3)$$
which in practical terms means that the estimated conditional probability of the $i$th node being in its $k$th state, given that its father node set is in its $j$th state, equals the relative frequency of that event in the failure records:
$$\hat{P}\bigl(V_i = k \mid \mathrm{Pa}(V_i) = j\bigr) = \frac{m_{ijk}}{m_{ij}}. \qquad (4)$$
When all parameters in $\theta$ take these values, the objective likelihood function attains its largest value.
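A minimal counting implementation of (3) might look as follows; the record layout is the same hypothetical list-of-dicts format used in the earlier sketches, and parent configurations that never occur in the data are simply left out of the estimate.

```python
from collections import Counter, defaultdict

def mle_cpd(data, node, parents):
    """Maximum likelihood CPD estimate, theta_ijk = m_ijk / m_ij (see (3)).

    Returns {parent_configuration: {node_state: probability}}.
    Parent configurations never seen in the data are simply absent.
    """
    m_ijk = Counter((tuple(rec[p] for p in parents), rec[node]) for rec in data)
    m_ij = Counter(tuple(rec[p] for p in parents) for rec in data)

    cpd = defaultdict(dict)
    for (j, k), count in m_ijk.items():
        cpd[j][k] = count / m_ij[j]
    return dict(cpd)

# Example with the same hypothetical records as before.
data = [{"HP": 0, "HT": 0}, {"HP": 0, "HT": 0}, {"HP": 1, "HT": 1}, {"HP": 1, "HT": 0}]
print(mle_cpd(data, "HT", ["HP"]))  # {(0,): {0: 1.0}, (1,): {1: 0.5, 0: 0.5}}
```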

Because the CPDs in FPBN are the same as the CPDs in BN, the parameter learning of FPBN also uses the MLE method.

4. Simulation Study

4.1. Simulation Dataset

For the simulation study, we introduced a practical helicopter convertor FPBN model [34] as the original model. The nodes in the helicopter convertor FPBN model belong to three subsets. The failure cause subset includes “Power part,” “Voltage adjustor,” “Transformation filter,” “Output filter,” and “Fan.” The failure mode node is “No output” and the failure detection nodes are “Voltage output,” “Filter output,” and “Fan sound.” The details of these nodes are shown in Table 1.

The cause-effect relationships among nodes are shown in Figure 2. The node “Power part” affects node “Voltage adjustor” while the nodes “Voltage adjustor,” “Transformation filter,” and “Output filter” result in the convertor failure of  “No output.” The node “Voltage output” is an outer representation of “Power part” and the node “Filter output” is also an outer representation of “Transformation filter.” The node “Fan sound” is the result of both “Output filter” and “Fan.”

According to this practical model, 3000, 5000, and 7000 operation records are generated separately with a random sampling method. Each record represents the states of all variables of the helicopter convertor at one time. These failure record datasets of different scales are named dataset 1, dataset 2, and dataset 3 and are used to demonstrate the application of FPBN-DC and verify its performance independently.
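A common way to generate such records from a known model is ancestral (forward) sampling: nodes are visited in topological order and each state is drawn from the node's CPD given the already-sampled states of its fathers. The sketch below illustrates the idea on a hypothetical two-node fragment; it is not the sampling code used in the paper.

```python
import random

def ancestral_sample(order, parents, cpd, rng=random):
    """Draw one record by forward sampling.

    order   : node names in topological order
    parents : {node: list of father nodes}
    cpd     : {node: {parent_configuration_tuple: {state: probability}}}
    """
    rec = {}
    for node in order:
        config = tuple(rec[p] for p in parents[node])
        dist = cpd[node][config]
        states, probs = zip(*dist.items())
        rec[node] = rng.choices(states, weights=probs, k=1)[0]
    return rec

# Hypothetical fragment: PowerPart -> VoltageOutput, binary states,
# with invented probabilities for illustration only.
parents = {"PowerPart": [], "VoltageOutput": ["PowerPart"]}
cpd = {
    "PowerPart":     {(): {0: 0.95, 1: 0.05}},
    "VoltageOutput": {(0,): {0: 0.98, 1: 0.02}, (1,): {0: 0.10, 1: 0.90}},
}
dataset = [ancestral_sample(["PowerPart", "VoltageOutput"], parents, cpd)
           for _ in range(5)]
print(dataset)
```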

4.2. Simulation Results

To verify the effectiveness and efficiency of the proposed FPBN-DC algorithm, the BN-IA algorithm, which ignores the assumptions on node types and edge directions in FPBN, is also applied to learn the network structure from the datasets.

First of all, we discuss the coding scale of the network structure, which determines the search space of each algorithm. For both FPBN-DC and BN-IA, an adjacency matrix is used to describe the network structure, as shown in Table 2. In BN-IA, the structure code needs 72 bits because there are 9 nodes in the model and each node needs 8 bits to represent its possible father nodes. In FPBN-DC, the structure code has only 20 bits because the structure learning scale is reduced according to Lemma 2: only the network structure among the 5 nodes of the failure cause subset actually has to be searched. Generally, for score-and-search based algorithms, the searching time is mainly consumed by the score calculation. Because the BIC score has to go through the dataset repeatedly to count $m_{ijk}$ and $m_{ij}$ according to (1), the searching time depends strongly on $q_i$ and $r_i$. Since $q_i$ represents the number of candidate combination states of the father nodes of the $i$th node, it is directly related to the maximum number of father nodes a node may have. It is therefore clear that the FPBN-DC algorithm has a smaller search space and less searching time.
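The two code lengths follow directly from the adjacency-matrix encoding: with $n$ nodes, each node needs $n-1$ bits to mark its possible father nodes, as the small check below illustrates (a worked restatement of the figures above, not code from the paper).

```python
def code_bits(n_nodes):
    """Bits needed to encode an adjacency matrix when each node may take
    any of the other n - 1 nodes as a father."""
    return n_nodes * (n_nodes - 1)

print(code_bits(9))  # 72 bits for BN-IA over all 9 convertor nodes
print(code_bits(5))  # 20 bits for FPBN-DC over the 5 failure cause nodes
```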

Then, the two algorithms learn from the 3 generated datasets 10 times each with the same parameters. The highest and average fitness values (the fitness value is equivalent to the BIC score) of each algorithm on every dataset are listed in Table 3, together with the convergence iterations of the corresponding fitness values. Since the fitness value represents the similarity between the network structure and the corresponding dataset, the highest fitness over the 10 runs is the main criterion for the effectiveness of an algorithm, while the corresponding convergence iteration of the highest fitness is a reasonable criterion for its efficiency.

According to the highest fitness (in bold type) of each algorithm and each dataset over the 10 runs in Table 3, FPBN-DC and BN-IA reach the same highest fitness values on all 3 datasets. However, FPBN-DC needs far fewer iterations to reach the highest fitness than BN-IA. Furthermore, for FPBN-DC the average fitness values of the 10 runs on the 3 datasets equal the highest fitness values, which means that FPBN-DC finds the best network structure in every single run. For BN-IA, the average fitness values of the 10 runs on the 3 datasets are lower than the highest fitness values, which reflects the stochastic error of that algorithm.

The objective of FPBN structure learning is to search for the best network that represents the dataset comprehensively. So, comparing the best network found by each algorithm with the original one is another useful criterion for algorithm effectiveness. The edge differences between the best network of each algorithm over the 10 runs and the original FPBN in Figure 2 are listed in Table 4, which gives the numbers of added edges, missing edges, and reversed edges. The best network structures learned by FPBN-DC and BN-IA are exactly the same as the original one on all 3 datasets. This result also shows the ability of FPBN-DC to recover the original structure from data.

Finally, according to the comparison results, the FPBN-DC algorithm reaches the highest fitness value with fewer iterations and a higher average fitness, which verifies its effectiveness and efficiency compared with the BN-IA algorithm.

5. Conclusion

This paper proposed an effective algorithm to build the FPBN model from a system operation dataset. The types of network nodes, the directions of network edges, and the CPDs of nodes in FPBN are first discussed in detail. Then, the FPBN-DC algorithm is introduced into the FPBN modeling process to learn the network structures of the failure detection, failure mode, and failure cause nodes separately according to the assumptions on their edge directions. Finally, a simulation study of a helicopter convertor FPBN model is carried out, in which the proposed FPBN-DC and the BN-IA algorithms learn from the same 3 generated datasets 10 times each with the same parameters. Taking advantage of the divide-and-conquer principle, FPBN-DC has a smaller coding scale than BN-IA, which means a smaller search space and less searching time. The comparison results also show that FPBN-DC reaches the best fitness value with fewer iterations. The network structures learned by FPBN-DC from the 3 datasets are exactly the same as the original one, which also verifies its effectiveness. For future research, with the application of sensors in practical engineering systems, we plan to introduce real-time detection nodes into the FPBN model, which may provide more precise failure prediction.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

The authors gratefully acknowledge the financial support for this research from the National Natural Science Foundation of China (Grant nos. 71101116 and 71271170), the Aeronautical Science Foundations of China (Grant no. 2011ZC53027), and the Basic Research Foundation of NPU (Grant no. JC20120228).