Abstract

Cost control is becoming increasingly important in hospital management. Operating rooms are among the most resource-intensive units of a hospital, so their optimal use can yield substantial savings. However, because of the uncertainty inherent in surgical procedures, it is difficult to schedule operating room use in advance. In general, the durations of both surgery and anesthesia emergence determine the time requirements of an operating room, and both are difficult to predict. In this study, we used an artificial neural network to construct a surgery and anesthesia emergence duration prediction system. We also propose an intelligent data preprocessing algorithm that balances and enhances the training dataset automatically. The experimental results indicate that the prediction accuracy of the proposed serial prediction system is acceptable in comparison to the separate systems.

1. Introduction

In recent years, cost savings have become critical in hospitals [1–3]. In addition to staff salaries, hospitals must pay for equipment, materials, and administrative expenses. A conservative estimate placed operating room utilization costs at over $15 per minute in 2012 [1]; by 2017, the estimate had risen to $36 per minute [2]. Cost has long been an important optimization objective in operating room scheduling [3]. Therefore, cost control is a key factor in successful hospital management.

The operating room is a core component of the hospital, and its use contributes considerably to hospital costs [4]. In general, surgeries require expensive resources, such as equipment, materials, energy, and medical staff [5]. In some hospitals, several operating rooms form a laminar flow operation center, in which the energy cost of using five operating rooms for one hour is almost the same as that of using one operating room for one hour. However, the energy cost of using one operating room for five hours is five times that of using five operating rooms for one hour. If the operating rooms are not optimally used, the cost is high. Hence, detailed operating room scheduling is required to ensure that all required resources are available at the right time.

Therefore, a well-arranged operating room schedule can reduce costs. However, such a schedule is difficult to achieve, primarily because of the uncertainty in the operating room use duration [6], which is determined by the surgery and anesthesia emergence durations. Surgery duration is defined as the time from the beginning to the end of surgery. Anesthesia emergence duration is the time from the end of the surgery to the time when the patient wakes up; anesthesia emergence is relevant only in surgeries under general anesthesia. If the operating room schedule does not allocate sufficient time for an operation, the next operation cannot start on time. However, if the planned operation duration is longer than the actual duration, the operating room will be vacant, leading to a waste of resources. In extreme cases, operating rooms may remain open beyond their planned working hours, incurring costly overtime wages and energy consumption [7]. Similarly, an inaccurate estimate of the duration of anesthesia emergence can lead to poor surgical scheduling, resulting in resource wastage. In contrast, optimal scheduling can reduce the resource consumption caused by the difference between the estimated and the actual operating room use times.

Previous studies have used several approaches to estimate surgery duration [8–17]. A common method is to rely on the surgeon's estimate: surgeons make a rough estimate of the operating room use time based on the average duration of previous similar operations (prior experience), the type of surgery, patient characteristics, and other factors [8]. However, surgeons tend to avoid risk and have a limited ability to estimate surgery duration [9]. With this method, case durations were overestimated by up to 32% or underestimated by up to 42% [10]. A second common approach is to use electronic health records (EHRs) to calculate a given case duration from historical data [11]. Because EHRs do not take into account factors such as body mass index, anesthesia type, and staff [12], their accuracy is only modestly higher for ordinary patients [11] and depends greatly on other factors. Another method is to model case duration using a probability distribution; commonly used distributions include the hypergamma, lognormal, gamma, and Weibull distributions [13]. This approach has often been used to describe the stochastic duration of surgery; however, its accuracy has not been reported in the literature. With the popularization of artificial intelligence, numerous statistical and machine-learning tools have been used to predict surgery durations, including Bayesian methods [14], regression techniques [14–16], neural networks [14–16], and random forests [17]. For example, Devi et al. [14] established neural network and regression models for three types of ophthalmic surgery (cataract, corneal transplant, and oculoplastic surgery). The root mean square error (RMSE) of the predictions was affected by the number of hidden-layer neurons in the model; depending on the type of surgery, the RMSE ranged from 0.0656 to 0.6295.
Based on the findings of reference [14], we conducted more in-depth experiments on the choice of the prediction model architecture, including the number of hidden layers and the number of neurons in each hidden layer.

Bartek et al. [15] used linear regression and supervised machine learning to create surgeon-specific models (92 individual models) and service-specific models (12 models). The former achieved prediction accuracies of 32% to 39%, better than those of the latter. Shahabikargar et al. [17] employed a filtered random forest algorithm to predict surgical duration. Before modeling, they cleaned the data of 60,362 elective surgeries from two hospitals by deleting missing, inconsistent, and duplicate values. The overall prediction error decreased by 44% (mean absolute percentage error from 0.68 to 0.38) compared with the error without data preprocessing. Therefore, for data preprocessing, we referred to the data cleaning operations in reference [17] and propose an intelligent data preprocessing algorithm to improve data quality.

Tuwatananurak et al. [16] employed supervised learning on 990 valid surgical records collected within three months. The results showed a 7 min improvement in the absolute difference between predicted and actual case durations compared with conventional EHR estimates, as well as a 70% reduction in overall scheduling inaccuracy. As hospitals seek more economical process arrangements, anesthesia recovery is being moved from the operating room to the postanesthesia care unit, separating anesthesia recovery time from operation time. Thus, the two must be predicted separately. To the best of our knowledge, no study has reported the prediction of anesthesia emergence duration. Therefore, in this study, we constructed a serial artificial neural network (ANN) system to predict surgery duration and anesthesia emergence duration.

Artificial intelligence has recently been applied to various prediction problems with good performance. For example, an ANN was used to predict the recidivism of commuted prisoners [18] and to solve a ship detection problem [19]. A support vector machine (SVM) was used to classify reclaimed wafers into good and not-good categories [20], and the problem of driver lane-keeping ability in fog was addressed using association rules [21]. The ANN has been used to solve various types of prediction problems and has shown better accuracy than the SVM in supervised learning [20, 22]. Therefore, to predict the durations of surgery and anesthesia emergence more accurately and thereby arrange operating rooms more efficiently and effectively, we used an ANN to construct the surgery duration prediction system in our previous study [23], and we used an ANN to construct the anesthesia emergence duration prediction system in this study. The anesthesia emergence duration is affected by the surgery duration: the longer the surgery, the longer the anesthesia emergence. Thus, the former is an input variable of the anesthesia emergence duration prediction system. However, the actual surgery duration is unknown before the procedure is performed. Therefore, we first constructed two prediction systems, for surgery duration and anesthesia emergence duration, and then combined them to obtain the final prediction system, using the predicted surgery duration as the input variable of the anesthesia emergence duration prediction system. According to the experimental results, the prediction accuracy of the final prediction system was 95.52%. In addition, we developed an intelligent data preprocessing algorithm that balances and enhances the dataset for the ANN by automatically calculating the most appropriate replication multiple.

The remainder of this paper is organized as follows: Section 2 introduces the ANN, perceptron, and multilayer perceptron (MLP). Section 3 describes the experiments conducted and the data preprocessing algorithm. Section 4 discusses the experimental results of the three prediction systems: surgery duration, anesthesia emergence duration, and final prediction systems. Finally, the conclusions and suggestions for future research are presented in Section 5.

2. Review of the Artificial Neural Network

2.1. Artificial Neural Network

An ANN is a complex artificial system based on mathematical models of the function, structure, and information processing of the human brain and nervous system [24, 25]. Like the human brain, an ANN is a self-learning system that learns to predict outputs over numerous iterations. The nodes of an ANN are akin to neurons in the human brain; the weighted output of each node serves as input to the nodes of the next layer [26, 27]. During learning, the weights are updated by a systematic algorithm. To improve output accuracy, an ANN typically uses the backpropagation learning algorithm: it performs a forward pass with the current weights and biases, calculates the error between the output and the actual value, propagates the error backwards, and updates the weights and biases accordingly, so that after several such forward and backward passes the output accuracy is high [28, 29]. After the ANN is trained, new data can be classified or predicted using the received stimulus (new input data), the weights, and the biases.

The ANN is a powerful tool for learning and modeling complex linear or nonlinear relationships. More precisely, the model it builds resembles a "black box": the nature of the relationship between the input and output data is unknown [30]. ANNs have been applied to a wide range of problems in multiple fields, including engineering [31, 32], biology, mathematics, and medicine, where they are used to analyze and predict various diseases [33].

2.2. Perceptron

The perceptron model was derived from the MP model established by McCulloch and Pitts [34]. By simulating the principles and processes of biological nerve cells, the MP model describes the mathematical theory and network structure of artificial neurons and proves that a single neuron can realize a logical function. The MP model contains input, output, and computation functions. The input and output are analogous to the dendrite and axon of a neuron, respectively, whereas the calculation is similar to the processing conducted in the nucleus, with each synapse being assigned a weight.

Inspired by the MP model, the perceptron model consists of two layers. The first layer, called the input layer, receives the stimulus and passes it to the last layer. In the final layer, called the output layer, all input stimuli are multiplied by their respective weights, and the perceptron adds all the weighted stimuli and the bias using the summation function. Finally, the perceptron uses an activation function to simulate data processing in the brain [35]. The basic network structure of a perceptron is shown in Figure 1.
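The summation and activation steps above, and the MP model's claim that a single neuron can realize a logical function, can be illustrated with a minimal sketch. The weights, bias, and step activation below are illustrative choices, not values from this study:

```python
# Minimal perceptron sketch: weighted sum of inputs plus bias,
# passed through a step activation function.
def perceptron(inputs, weights, bias):
    # Summation function: sum of weighted stimuli plus the bias.
    s = sum(x * w for x, w in zip(inputs, weights)) + bias
    # Step activation simulating whether the neuron "fires".
    return 1 if s >= 0 else 0

# Example: with these weights and bias, the single neuron
# realizes the logical AND function.
and_weights, and_bias = [1.0, 1.0], -1.5
print(perceptron([1, 1], and_weights, and_bias))  # 1
print(perceptron([1, 0], and_weights, and_bias))  # 0
print(perceptron([0, 0], and_weights, and_bias))  # 0
```

Training a perceptron amounts to adjusting `weights` and `bias` until the outputs match the desired targets.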

2.3. Multilayer Perceptron

To better handle nonlinear problems, Hecht-Nielsen proposed the multilayer perceptron, which places one or more additional layers of neurons between the input and output layers [35]. As shown in Figure 2, the basic MLP structure has two fundamental components: neurons and the links between them. The neurons are the processing elements, and the links are their interconnections. Every link has a corresponding weight or bias parameter. When a neuron receives stimuli from other neurons via its links, it processes the information and produces an output. The intermediate layers are assumed not to be disturbed by the external environment; therefore, they are called hidden layers, and their nodes are called hidden nodes. As in the perceptron, the input neurons receive external stimuli, and the output neurons deliver the output. With similar neuron dynamics, hidden neurons receive stimuli from neurons earlier in the network and relay their output to neurons later in the network [18].
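The layer-by-layer relay of stimuli can be sketched as follows; the toy network, its weights, and the sigmoid activation are illustrative assumptions, not the parameters of the systems built in this study:

```python
import math

def sigmoid(x):
    # Smooth nonlinear activation commonly used in MLPs.
    return 1.0 / (1.0 + math.exp(-x))

def layer_forward(inputs, weights, biases):
    """One fully connected layer: each neuron takes the weighted sum
    of all inputs plus its bias, then applies the activation."""
    return [sigmoid(sum(x * w for x, w in zip(inputs, ws)) + b)
            for ws, b in zip(weights, biases)]

def mlp_forward(inputs, layers):
    """Propagate stimuli layer by layer: each layer's output
    becomes the next layer's input."""
    for weights, biases in layers:
        inputs = layer_forward(inputs, weights, biases)
    return inputs

# Toy network: 2 inputs -> 2 hidden neurons -> 1 output neuron.
layers = [
    ([[1.0, -1.0], [-1.0, 1.0]], [0.0, 0.0]),  # hidden layer
    ([[1.0, 1.0]], [-0.5]),                    # output layer
]
out = mlp_forward([0.5, 0.2], layers)
```

Backpropagation would adjust each layer's weights and biases from the output error; only the forward pass is shown here.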

3. Experiments

3.1. Data Setting

Operational records collected between January 2019 and July 2020 from the Affiliated Hospital of Panzhihua University were used as samples in this study. The records were used only for academic purposes and were anonymized to protect privacy.

In total, 15,754 samples were collected for this study. These samples were the data of patients without hepatic and renal diseases. To eliminate potential factors that could influence the operation time, all samples from emergency surgery patients or patients admitted to the intensive care unit after surgery were excluded. Thus, records from only patients with complete case data were included. Samples with all types of anesthesia were considered to predict the duration of surgery, whereas only samples under general anesthesia were considered to predict the duration of anesthesia emergence.

To improve the prediction accuracy of the model, we collected the data available preoperatively and identified the influencing factors of the surgery and anesthesia emergence durations as the input variables of the model through a literature survey and physician interviews. The input and output variables of the surgery and anesthesia emergence duration prediction systems are presented in Tables 1 and 2, respectively. In Table 1, the input variables comprise three parts: basic patient information and preoperative physiological data (A1–A18), surgeon information (A20–A23), and operation information (A19 and A24). In Table 2, the input variables comprise three parts: basic patient information and preoperative physiological data (A1–A18), anesthesiologist information (A20–A23), and operation information (A19 and A24). Comparing these tables, the 20th and 24th input variables of the two systems differ. First, as mentioned in the data setting criteria, samples with both local and general anesthesia were used in the first system, whereas only samples with general anesthesia were used in the second system; hence, in the second system, the anesthesia variable was subdivided into four types of general anesthesia. Furthermore, the duration of surgery depends on the surgeon, whereas the duration of anesthesia emergence depends on the anesthesiologist. Moreover, the surgical grade has a profound impact on the duration of surgery but a negligible effect on the duration of anesthesia emergence; hence, it was ignored in the latter case. Finally, the duration of surgery may affect the duration of anesthesia emergence; therefore, the output variable of the first system was used as the 24th input variable of the second system. Given that most surgery durations were less than 4 h, all samples with a duration of less than 4 h were used to predict the duration of surgery.
Given that most anesthesia emergence durations are less than 1 h, all samples with a duration of less than 1 h were used to predict the duration of anesthesia emergence. Accordingly, the last row of Table 1 shows that the surgery duration was divided into four scales: no more than 1 h, 1 to 2 h, 2 to 3 h, and 3 to 4 h. Similarly, the duration of anesthesia emergence was divided into four scales: no more than 15 min, 15 to 40 min, 40 to 50 min, and 50 to 60 min, as shown in the last row of Table 2.
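The mapping of raw durations to these four output scales can be sketched as follows; the handling of boundary values is our assumption, since the tables do not state whether boundaries are inclusive:

```python
# Hedged sketch of the output binning described above: durations
# (in minutes) map to one of four category indices, 0 to 3.
def surgery_scale(minutes):
    # Scales from Table 1: <=1 h, 1-2 h, 2-3 h, 3-4 h;
    # samples over 4 h were excluded from the study.
    for category, upper in enumerate([60, 120, 180, 240]):
        if minutes <= upper:
            return category
    raise ValueError("sample excluded: duration exceeds 4 h")

def emergence_scale(minutes):
    # Scales from Table 2: <=15, 15-40, 40-50, 50-60 min;
    # samples over 1 h were excluded from the study.
    for category, upper in enumerate([15, 40, 50, 60]):
        if minutes <= upper:
            return category
    raise ValueError("sample excluded: duration exceeds 1 h")

print(surgery_scale(90))    # 1 (1 h to 2 h)
print(emergence_scale(12))  # 0 (no more than 15 min)
```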

The success of an ANN depends heavily on appropriate data preprocessing. Hence, all data in this study were preprocessed using data transformation [36], inspection [37], and exclusion of outliers. After data preprocessing, 6,507 and 5,790 surgery samples were retrospectively used to predict the durations of surgery and anesthesia emergence, respectively. In addition, normalizing the data is recommended to avoid gradient explosion and to eliminate the influence of data heterogeneity, which can hinder the learning process [38]; all data were normalized to the range 0.1 to 0.9. Moreover, balancing and enriching the data are essential steps in solving classification problems [39]. We propose an intelligent data preprocessing algorithm to balance and enhance the dataset, after which the dataset is divided into training, testing, and validation subsets. The most important step of this algorithm is data balancing, whose purpose is to reduce the differences in the amount of data across categories. Two examples of data balancing are shown in Figure 3. Let Ni denote the amount of data in category i, and let Nmax denote the maximum of the Ni; every category is balanced with respect to Nmax. In Figure 3, for example, N1, the amount of data in Category 1, is 60, and Nmax is 100. The difference between the original amount (60) and Nmax (100) is 40. If the amount of data in Category 1 is increased to 120, the difference between 120 and Nmax decreases to 20; therefore, for Category 1, the best multiple is 2. In the second example, N2, the amount of data in Category 2, is 30. If the amount of data in Category 2 is increased to 90, the difference from Nmax is 10, whereas increasing it to 120 raises the difference to 20. Therefore, for Category 2, the best multiple is 3.
Based on this concept, the multiplication rules for data balancing are listed in Table 3: the multiple chosen for category i is the integer k that minimizes the difference between k × Ni and Nmax. The main process of the intelligent data preprocessing algorithm is illustrated in Figure 4. The algorithm requires only the normalized data, the multiple to be used for enhancement, and the partition ratio of the dataset as input, and it outputs the balanced, enhanced, and partitioned dataset. New samples were generated, expanding the database, by adding slight noise to the input data while keeping the same output category; the noise value was between −0.03 and +0.03.
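As an illustration, the normalization, balancing, and enhancement steps described above can be sketched as follows. The multiple-selection rule is inferred from the two examples (the integer that brings the category size closest to Nmax), and the function names are ours, not those of the actual implementation:

```python
import random

def normalize(values, lo=0.1, hi=0.9):
    """Scale raw values into the range [0.1, 0.9]."""
    vmin, vmax = min(values), max(values)
    return [lo + (hi - lo) * (v - vmin) / (vmax - vmin) for v in values]

def best_multiple(n_i, n_max):
    """Replication multiple for a category: the integer k (>= 1)
    that makes k * n_i closest to n_max."""
    return max(1, round(n_max / n_i))

def balance_and_enhance(samples_by_category, noise=0.03):
    """Replicate each category by its best multiple, adding slight
    noise to the inputs while keeping the same output category."""
    n_max = max(len(s) for s in samples_by_category.values())
    enhanced = {}
    for cat, samples in samples_by_category.items():
        k = best_multiple(len(samples), n_max)
        new = list(samples)
        for _ in range(k - 1):
            for x in samples:
                new.append([v + random.uniform(-noise, noise) for v in x])
        enhanced[cat] = new
    return enhanced

# Figure 3 examples: 60 samples -> multiple 2; 30 samples -> multiple 3.
print(best_multiple(60, 100))  # 2
print(best_multiple(30, 100))  # 3
```

Splitting the balanced result into training, testing, and validation subsets by the given partition ratio would complete the algorithm.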

After data balancing, the data were enhanced threefold in the initial experiments and tenfold in the final experiment. In addition, data representation is an essential part of a successful ANN [37]. In this study, the output variable was encoded as four binary digits, namely, 1000, 0100, 0010, and 0001, where the position of the 1 indicates the category.
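This binary output representation corresponds to standard one-hot encoding, which can be sketched as:

```python
def one_hot(category, num_categories=4):
    """Encode a category index as four binary digits,
    e.g. category 0 -> 1000, category 2 -> 0010."""
    return [1 if i == category else 0 for i in range(num_categories)]

print(one_hot(0))  # [1, 0, 0, 0]
print(one_hot(2))  # [0, 0, 1, 0]
```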

3.2. Computing Environment Settings

We wrote the program in Python 3.7 (64 bit). The hardware comprised an Intel Core (TM) i7-10510U (2.3 GHz) CPU and 8 GB of memory, running the Windows 10 Home Edition (64 bit) operating system.

3.3. Experimental Structure

The experiment was conducted in two parts. In the first part, we used the MLP to construct the surgery duration prediction system and the anesthesia emergence duration prediction system. To determine the optimal architecture of each model, we conducted two sets of experiments with the same data partitioning and parameter settings. The total dataset was divided into training, testing, and validation datasets in the respective proportions of 60%, 20%, and 20% [39, 40].

In the MLP, the Adam optimizer was used to adjust the weights, and the cross-entropy loss function was used to calculate the loss of the prediction system. The batch size was set to 100, and the number of training epochs was set to 200, increased to 1,000 in the final experiment on the optimal architecture. The number of hidden layers and the number of hidden nodes per layer are critical determinants of the results [35, 39, 41]. We used a trial-and-error method to identify the optimal numbers of hidden layers and hidden nodes: the number of hidden layers was set to three, four, five, or six, and the number of nodes in each hidden layer was set to 64, 128, 256, or 512. Thus, experimental results for 16 parameter combinations were obtained for comparison and analysis. To reduce the stochastic effects of the experiments, we conducted ten experiments for each parameter combination. Finally, we obtained the two optimal architectures and the weights used in the second part. Figures 5 and 6 show the MLP structures of the surgery duration prediction system and the anesthesia emergence duration prediction system, respectively.
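The trial-and-error search over the 16 parameter combinations, with ten runs each, can be sketched as follows; `train_and_evaluate` is a placeholder for the actual MLP training and testing, and the dummy evaluator at the bottom is purely illustrative:

```python
from itertools import product
from statistics import mean

# Candidate depths and widths tested in this study.
HIDDEN_LAYERS = [3, 4, 5, 6]
HIDDEN_NODES = [64, 128, 256, 512]
RUNS = 10  # ten experiments per combination to reduce stochastic effects

def grid_search(train_and_evaluate):
    """Average the test accuracy of each layers-nodes combination
    over RUNS runs and return the best combination."""
    results = {}
    for layers, nodes in product(HIDDEN_LAYERS, HIDDEN_NODES):
        accs = [train_and_evaluate(layers, nodes, run) for run in range(RUNS)]
        results[f"{layers}-{nodes}"] = mean(accs)
    best = max(results, key=results.get)
    return best, results

# Dummy evaluator that merely favors 4 layers and 256 nodes,
# standing in for real training; not actual experimental data.
def dummy_eval(layers, nodes, run):
    return 0.70 + 0.01 * (nodes == 256) + 0.01 * (layers == 4)

best, results = grid_search(dummy_eval)
print(best)  # 4-256
```

In the study itself, the choice among statistically similar architectures was further refined by t-tests and runtime cost, as described in Section 4.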

In the first part, we used the actual duration of surgery as an input variable of the second model. However, as mentioned in Section 3.1, the output variable of the first model is an input variable of the second model. Therefore, in the second part, we merged the two prediction systems into one; that is, we used the predicted duration of surgery as an input variable of the second model. We intersected the 6,507 surgery samples with the 5,790 anesthesia emergence samples, obtaining 4,285 samples. The predicted surgery duration was obtained by feeding these samples into the first prediction system; it was then combined with the other attributes of each sample and fed into the second prediction system to predict the duration of anesthesia emergence. Figure 7 shows the final combination prediction system.
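The serial combination can be sketched as follows; the two stand-in models are placeholders for the trained MLPs, and the input layout is a simplification of the 24-variable scheme in Tables 1 and 2:

```python
def serial_predict(sample, surgery_model, emergence_model):
    """Serial combination sketch: the surgery duration category
    predicted by the first system is appended to the sample's other
    attributes as the last input of the second system."""
    predicted_surgery_scale = surgery_model(sample)
    emergence_inputs = sample + [predicted_surgery_scale]
    return emergence_model(emergence_inputs)

# Dummy stand-in models for illustration only.
def surgery_model(attrs):
    return 1  # always predicts the "1 h to 2 h" scale

def emergence_model(attrs):
    # Longer predicted surgeries map to longer emergence scales.
    return 2 if attrs[-1] >= 1 else 0

print(serial_predict([0.4, 0.7], surgery_model, emergence_model))  # 2
```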

4. Experimental Results and Analysis

4.1. Experimental Results of the Surgery Duration Prediction System

We used a trial-and-error method to identify the final architecture of the surgery duration prediction system. The experimental results are listed in Tables 4–11. Tables 4 and 5 present the prediction accuracy and loss value, respectively, of the surgery duration prediction system. In addition, to further explore the performance of the different MLP architectures, the experimental results were analyzed using the t-test, as shown in Tables 6–10. Table 11 lists the running time cost of each architecture. We determined the final architecture of the MLP based on the maximum prediction accuracy and a reasonable running time cost.

In Table 4, Mean and Std indicate the average prediction accuracy and standard deviation, respectively, over the ten experiments, and Max and Min indicate the maximum and minimum prediction accuracies. Here, 3-64 denotes the MLP model with three hidden layers and 64 hidden neurons in each hidden layer. Among the three-hidden-layer architectures in Tables 4 and 5, the 3-512 architecture had the maximum average prediction accuracy (0.7254) and the minimum loss value (0.6664) on the testing dataset. As shown in Table 6, the 3-512 architecture was significantly better than the 3-64 and 3-128 architectures but not significantly better than the 3-256 architecture (p = 0.4854); in other words, the 3-512 and 3-256 architectures had similar prediction accuracies. In Table 11, the 3-512 architecture had a longer running time (1806.50 s) than the 3-256 architecture (584.24 s); that is, the 3-256 architecture reduced the runtime cost by 67.66% compared with the 3-512 architecture. Therefore, among the three-hidden-layer architectures, we chose the 3-256 architecture.

Among the four-, five-, and six-hidden-layer architectures in Tables 4 and 5, the 4-256, 5-256, and 6-256 architectures had the maximum average prediction accuracies (0.7711, 0.7601, and 0.7493, respectively) and the minimum loss values (0.6551, 0.6579, and 0.6606, respectively) on the testing dataset. In Tables 7 and 8, the MLP models with 256 neurons were significantly better than those with 64 and 128 neurons but not significantly better than the models with 512 neurons (p = 0.1291 and 0.4207, respectively); in other words, among the four- and five-hidden-layer architectures, the MLP models with 256 and 512 neurons had similar prediction accuracies. However, Table 11 shows that the 4-256 and 5-256 architectures reduced the runtime cost by 69.49% and 72.23% compared with the 4-512 and 5-512 architectures, respectively. Therefore, among the four- and five-hidden-layer architectures, we chose the 4-256 and 5-256 architectures. In Table 9, among the six-hidden-layer architectures, the 6-256 architecture was significantly better than the other architectures.

Among all architectures in Tables 4 and 5, the 4-256 architecture had the maximum average (Mean) prediction accuracies (0.7711, 0.8468, and 0.7714) and the minimum loss values (0.6551, 0.6364, and 0.6550) on the testing, training, and validation datasets, respectively. In addition, the 4-256 architecture had the best maximum (Max) and minimum (Min) prediction accuracies on almost all of the testing, training, and validation datasets. In Table 10, the 4-256 architecture was significantly better than the 3-256 and 6-256 architectures but not significantly better than the 5-256 architecture, although the p value (0.0710) was close to 0.05; in other words, the 4-256 architecture was nearly significantly better than the 5-256 architecture. In addition, in Table 11, the 4-256 architecture saved 18.07% of the runtime cost compared with the 5-256 architecture. Therefore, the final architecture of the surgery duration prediction system was the 4-256 architecture.

After determining the best architecture of the surgery duration prediction system, we sought to further improve the prediction accuracy through the dropout mechanism, data enrichment, and longer training. The experimental results are listed in Tables 12–15. Tables 12 and 13 present the effect of the dropout mechanism, and Tables 14 and 15 present the impact of data enrichment and longer training on the 4-256 architecture. In Tables 12 and 13, the 4-256 architecture without the dropout mechanism had the maximum average prediction accuracy (0.7711) and minimum loss value (0.6551) on the testing dataset; as the dropout probability increased, the average prediction accuracy decreased. In other words, the dropout mechanism did not improve the prediction accuracy of the surgery duration prediction system. We then enriched the data tenfold; in Tables 14 and 15, the 4-256 architecture with tenfold data had a better average prediction accuracy (0.8788) and loss value (0.6284) on the testing dataset. When we further increased the training time to 1,000 epochs, the average prediction accuracy increased to 0.9485 and the average loss value decreased to 0.6110 on the testing dataset. In other words, data enrichment and longer training improved the prediction accuracy of the surgery duration prediction system. The final architecture of the surgery duration prediction system is therefore the 4-256 architecture without dropout, trained on tenfold data for 1,000 epochs.

4.2. Experimental Results of the Anesthesia Emergence Duration Prediction System

We used a trial-and-error method to determine the final architecture of the anesthesia emergence duration prediction system. The experimental results are presented in Tables 16–23. Tables 16 and 17 present the prediction accuracy and loss values of the model, respectively. To further explore the performance of the different MLP architectures, the experimental results were also analyzed using t-tests, as shown in Tables 18–22. Table 23 lists the running time cost of each architecture. We determined the final architecture of the MLP based on the maximum prediction accuracy and a reasonable running time cost.

Among the three-hidden-layer architectures in Tables 16 and 17, the 3-512 architecture had the maximum average prediction accuracy (0.7391) and the minimum loss value (0.6632) on the testing dataset. As shown in Table 18, the 3-512 architecture was significantly better than the 3-64 and 3-128 architectures but not significantly better than the 3-256 architecture (p = 0.4726); in other words, the 3-512 and 3-256 architectures had similar prediction accuracies. In Table 23, the 3-512 architecture had a longer running time (1928.91 s) than the 3-256 architecture (613.93 s); hence, the 3-256 architecture reduced the runtime cost by 68.17%. Therefore, we determined that the 3-256 architecture was the best three-hidden-layer architecture for the prediction system.

Among the four- and five-hidden-layer architectures listed in Tables 16 and 17, the 4-256 and 5-256 architectures showed the maximum average prediction accuracies (0.7836 and 0.7905, respectively) and the minimum loss values (0.6520 and 0.6503, respectively) on the testing dataset. In Table 19, the 4-256 architecture was significantly better than the 4-64 and 4-128 architectures but not significantly better than the 4-512 architecture, although the p value (0.0892) was close to 0.05; in other words, the 4-256 architecture was nearly significantly better than the 4-512 architecture. In Table 23, the 4-256 architecture saved 68.15% of the runtime cost compared with the 4-512 architecture. Therefore, among the four-hidden-layer architectures, we determined that the best architecture was the 4-256 architecture. In Table 20, the 5-256 architecture was significantly better than the other five-hidden-layer architectures.

Among the six-hidden-layer architectures in Tables 16 and 17, the 6-128 architecture had the maximum average prediction accuracy (0.7454) and the minimum loss value (0.6616) on the testing dataset. As shown in Table 21, the 6-128 architecture was significantly better than the 6-64 and 6-512 architectures but not significantly better than the 6-256 architecture (p = 0.3675); therefore, the 6-128 and 6-256 architectures had similar prediction accuracies. However, as shown in Table 23, the 6-256 architecture had a longer running time (1204.03 s) than the 6-128 architecture (988.21 s). Therefore, the 6-128 architecture, which reduced the runtime cost by 17.93%, was the best six-hidden-layer architecture for the prediction system.

Among all architectures listed in Tables 16 and 17, the 5-256 architecture had the maximum average (Mean) prediction accuracies (0.7905, 0.8443, and 0.7924) and the minimum loss values (0.6503, 0.6370, and 0.6499) on the testing, training, and validation datasets, respectively. In addition, the 5-256 architecture had the best maximum (Max) and minimum (Min) prediction accuracies on almost all of the testing, training, and validation datasets. In Table 22, the 5-256 architecture was significantly better than the 3-256 and 6-128 architectures but not significantly better than the 4-256 architecture; that is, the 5-256 and 4-256 architectures had similar prediction accuracies. Finally, in Table 23, the 4-256 architecture reduced the runtime cost by 26.90% compared with the 5-256 architecture. Therefore, we determined that the final architecture of the anesthesia emergence duration prediction system was the 4-256 architecture.

After determining the best architecture of the anesthesia emergence duration prediction system, we further improved the prediction accuracy using the dropout mechanism, data enrichment, and a longer training time. The experimental results are listed in Tables 24–27. Tables 24 and 25 present the effect of the dropout mechanism, and Tables 26 and 27 present the impact of data enrichment and a longer training time on the 4-256 architecture. In Tables 24 and 25, we can observe that the 4-256 architecture without the dropout mechanism has the maximum average prediction accuracy (0.7836) and the minimum loss value (0.6520) on the testing dataset. As the dropout probability increases, the average prediction accuracy decreases. Thus, the dropout mechanism cannot improve the prediction accuracy of the anesthesia emergence duration prediction system. We then enriched the data tenfold. In Tables 26 and 27, the 4-256 architecture with the tenfold-enriched dataset achieves a better average prediction accuracy (0.8956) and loss value (0.6242) on the testing dataset. When we increased the training time to 1,000 epochs, the average prediction accuracy increased to 0.9544, and the average loss value decreased to 0.6095 on the testing dataset. Finally, the architecture of the anesthesia emergence duration prediction system is a 4-256 architecture without a dropout mechanism, trained on the tenfold-enriched data for 1,000 epochs.
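For readers unfamiliar with the "4-256" notation, the following NumPy sketch shows the forward pass of such an MLP, including the inverted-dropout variant evaluated above. The weights are random here, purely to show the shape of the computation, and the output size of nine classes is a hypothetical interval count, not a value from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp_forward(x, hidden_layers=4, width=256, n_classes=9,
                dropout_p=0.0, train=False):
    """Forward pass of a '4-256'-style MLP: 4 hidden layers of 256 ReLU
    units and a softmax output over duration-interval classes."""
    h = x
    for _ in range(hidden_layers):
        w = rng.normal(0.0, np.sqrt(2.0 / h.shape[-1]), (h.shape[-1], width))
        h = np.maximum(h @ w, 0.0)                 # ReLU activation
        if train and dropout_p > 0.0:              # inverted dropout
            mask = rng.random(h.shape) >= dropout_p
            h = h * mask / (1.0 - dropout_p)
    w_out = rng.normal(0.0, np.sqrt(2.0 / width), (width, n_classes))
    logits = h @ w_out
    e = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)       # softmax probabilities

probs = mlp_forward(rng.normal(size=(8, 32)))      # 8 samples, 32 features
```

Setting `dropout_p` above zero during training corresponds to the dropout configurations of Tables 24 and 25; as reported there, this did not help because the model was not overtrained.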

4.3. Experimental Results of the Final Combination Prediction System

In this section, we discuss the results of the final system obtained by combining the surgery duration prediction system with the anesthesia emergence duration prediction system. As mentioned in Section 1, we predict the duration of anesthesia emergence from the surgery duration. However, the actual surgery duration cannot be obtained before surgery; therefore, we used the predicted surgery duration as the input variable of the anesthesia emergence duration prediction system. As mentioned in Section 3.3, we used 4,285 samples in the final combination prediction system. To compare the prediction accuracies, we used the same 4,285 samples in the surgery duration prediction system and the anesthesia emergence duration prediction system. The experimental results are listed in Table 28.

As shown in Table 28, the prediction accuracy of the anesthesia emergence duration prediction system is 0.9645; that is, 96.45% accuracy can be obtained when the actual surgery duration is used as the input. The prediction accuracy of the surgery duration prediction system is 0.9671, which indicates that, in the final combination prediction system, the input variable is 96.71% correct. Finally, the prediction accuracy of the final combination prediction system is 0.9552; that is, 95.52% accuracy for the anesthesia emergence duration can be obtained when the 96.71%-correct prediction of the surgery duration is used as the input. This value (95.52%) is close to 96.45%: the difference between the two systems is only 0.93%, i.e., less than 1%. Thus, the prediction accuracy of the final combination prediction system is acceptable.
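The serial pipeline described above can be sketched as follows. The model interfaces and the toy stand-in models are hypothetical, for illustration only; the accuracy gap at the end uses the values reported in Table 28:

```python
def serial_predict(features, surgery_model, emergence_model):
    """Serial pipeline: the surgery-duration prediction is appended to
    the feature vector fed to the emergence-duration model, because the
    actual surgery duration is unknown before the operation."""
    d_hat = surgery_model(features)
    return emergence_model(features + [d_hat])

# Toy stand-in models (hypothetical, for illustration only):
surgery_model = lambda f: len(f) % 4     # predicted duration interval
emergence_model = lambda f: f[-1] + 1    # depends on the predicted duration
interval = serial_predict([2, 0, 1], surgery_model, emergence_model)

# Acceptance check on the accuracies reported in Table 28:
gap = 0.9645 - 0.9552   # degradation from using the predicted input
```

The design choice here is to trade a sub-1% accuracy loss for the ability to schedule before surgery, when the true duration is unavailable.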

5. Conclusion and Future Research

In this study, we proposed an intelligent data preprocessing algorithm that balances data automatically and used the MLP model to construct the surgery and anesthesia emergence duration prediction systems. Based on existing patient data, we identified the main attributes that affect the prediction of the surgery duration and the anesthesia emergence duration and then preprocessed the data accordingly. The two systems were extensively tested, compared, and analyzed to determine their final configurations. By combining the two prediction systems, we were able to predict the duration of anesthesia emergence from the predicted duration of surgery. The experimental results yielded several interesting findings.

Firstly, the proposed intelligent data preprocessing algorithm has three functions: data balancing, data enhancement, and dataset partitioning. In particular, it automatically calculates the proper oversampling multiple for each category, depending on the amount of data in the category, so the data are balanced without manually computing a multiple per category. This reduced our workload for data balancing, especially when the number of categories is large. We found no similar automatic processing algorithm in our (admittedly limited) literature review; therefore, we believe the proposed algorithm could be applied more broadly to the data-balancing step of data preprocessing.

Secondly, the model architecture parameters have an important impact on the prediction accuracy of the model, which is consistent with the findings of [14]. That study took ophthalmic surgery as the research object and found that the prediction error of the model was affected by the number of hidden-layer neurons. In our research, we further expanded the exploration of the model architecture and found that the 4-256 architecture is the most suitable for both prediction systems. We also found that the smaller the architecture, the lower the accuracy, whereas the larger the architecture, the longer the running time. This confirms the importance and necessity of exploring the model architecture parameters.

Thirdly, overtraining did not occur; therefore, the dropout mechanism could not improve the prediction accuracy of the two prediction systems. However, data augmentation and a longer learning period did improve the prediction systems. These results are consistent with the theory of artificial neural networks. In addition, [16, 17] studied the influence of data quality on the prediction accuracy of a model and found that it materially affects accuracy; in this study, we likewise found that both the quantity and the quality of the data are very important to the prediction accuracy of the model.

Finally, predicting the surgery and anesthesia emergence durations in advance is essential for the effective scheduling of operations. After combining the two prediction systems, we used the predicted surgery duration to accurately predict the anesthesia emergence duration. Based on the resulting prediction accuracy, the combination of the two prediction systems is acceptable.
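The balancing idea described above — computing, for each category, an oversampling multiple relative to the largest category — can be sketched as below. This is our own illustration of the idea, not the paper's exact algorithm:

```python
from collections import Counter

def balance_multiples(labels):
    """For each category, the oversampling multiple that brings its count
    up to roughly the size of the largest category."""
    counts = Counter(labels)
    largest = max(counts.values())
    return {c: round(largest / n) for c, n in counts.items()}

def balance(samples, labels):
    """Replicate each sample according to its category's multiple."""
    mult = balance_multiples(labels)
    return [(s, y) for s, y in zip(samples, labels) for _ in range(mult[y])]

# Example: categories of size 6, 2, and 3 get multiples 1, 3, and 2.
labels = ["a"] * 6 + ["b"] * 2 + ["c"] * 3
balanced = balance(list(range(len(labels))), labels)
```

The point of automating the multiple computation is that it scales to many duration-interval categories without any manual per-category bookkeeping.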

It is worth noting that the data used in this study were collected from the Affiliated Hospital of Panzhihua University in China. One limitation of this study is therefore that the experimental results may only be applicable to hospitals of the same type in China. Moreover, many variables were not considered in this study, such as preoperative physical examination data. In fact, we attempted to collect preoperative physical examination data and integrate them with the data used in this study; however, the physical examination data contained too many missing values. We hypothesize that the main reason is that patients undergo different physical examinations, so the integrated data contained a large number of missing values. We also believe that patients undergoing the same type of surgery should undergo the same physical examinations. Therefore, we suggest that, in the future, specific physical examination items could be selected according to the organs subject to surgery, to predict the surgery duration and anesthesia emergence duration and obtain a more accurate prediction system. Finally, we aimed to predict the exact surgery and anesthesia emergence durations, which would enable surgical scheduling that optimizes resource use, saves energy, reduces carbon emissions, cuts costs, and improves patient satisfaction. If we could accurately predict the actual surgery time, we would be able to optimize the surgery schedule. In practice, however, too many uncertain factors affect the surgery time, making it difficult to predict exactly. Therefore, in our study, we converted the actual surgery time into a time interval (1 hour) to improve the prediction accuracy. However, if the time interval is too long, it will limit the optimization of surgical scheduling. Therefore, in the future, we suggest shortening the time interval, e.g., to 30 minutes, to improve the optimization of surgical scheduling. We also intend to use the predicted surgery and anesthesia emergence durations in future research on operating room scheduling.
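Converting an exact duration into an interval class, as described above, amounts to integer division by the interval width. A one-line sketch (a hypothetical helper of our own, not from the paper):

```python
def duration_to_interval(minutes, width_min=60):
    """Map an exact duration (in minutes) to an interval class index for
    a given interval width: 60 min as in this study, or 30 min as
    suggested for future work."""
    return int(minutes // width_min)
```

For example, a 95-minute surgery falls in class 1 on the 1-hour grid but in class 3 on the 30-minute grid, illustrating the finer scheduling resolution a shorter interval would provide.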

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Disclosure

This paper is an extension of our conference paper that has previously been published [23].

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this article.

Acknowledgments

The authors would like to thank Editage (http://www.editage.cn) for English language editing. This work was funded by the Research Funding Project of Panzhihua University (2020DOCO011).