Abstract

In this article, we propose fuzzy soft models for decision making in haze pollution management. The main aims of this research are (i) to provide a haze warning system based on real-time atmospheric data and (ii) to identify the most hazardous location of the study area. PM10 density is used as the severity index of the problem. The efficiency of the model is assessed by the prediction accuracy ratio on real data from 1st January 2016 to 31st May 2016. The fuzzy soft theory is modified to make the models more suitable for the problem. The results show that our fuzzy models improve the prediction accuracy ratio compared to prediction based on PM10 density only. This work illustrates a fuzzy analysis that is capable of simulating the unknown relations between a set of atmospheric and environmental parameters. The study area covers eight provinces in the northern region of Thailand, where the problem occurs severely every year during the dry season. Seven principal parameters are considered in the model: PM10 density, air pressure, relative humidity, wind speed, rainfall, temperature, and topography.

1. Introduction

Pollution problems are inevitably a global concern of the 21st century. Over the past decade, polluted haze has become a major problem in the northern region of Thailand and surrounding countries. In March 2019, the problem reached a crisis when the daily average PM2.5 and PM10 (particulate matter of 2.5 microns and 10 microns in diameter or smaller) densities were well beyond the national standards of 25 μg/m3 and 50 μg/m3 for several days, according to local environmental data sources such as the Pollution Control Department [1], the Climate Change Data Centre of Chiang Mai University [2], and the Smoke Haze Integrated Research Unit [3]. This situation occurs every year in the dry season, from January to May, and generally reaches its peak in March. During this period, a large amount of particulate matter is released into the atmosphere, together with carbon monoxide, carbon dioxide, volatile organic compounds, and carcinogenic polycyclic aromatic hydrocarbons [4]. The main emission source is open biomass burning, such as forest fires, solid waste burning, and agricultural residue field burning [5, 6].

This problem has a significant effect on human health, the local tourism industry, and the economy as a whole, especially in Chiang Mai province, a popular tourist destination.

The public health ministry of Thailand has reported an increase in bronchial asthma and respiratory diseases among people living in these areas. In addition, these fine particles contain carcinogenic polycyclic aromatic hydrocarbons that can induce lung cancer [7]. The smoke haze episodes also reduce visibility and cause a variety of environmental effects, which eventually lead to a decline in economic sectors such as tourism, transportation, and agriculture. The Thai government has launched various policies to bring the smoke haze problem under control. However, the problem continues to grow, even with the enforcement of the outdoor burning ban issued by the Thai government for the February to April period.

The atmospheric parameters and the topography play key parts in the problem. Air pollutants are trapped near ground level by the meteorological conditions (e.g., stagnant air), and the basin-like topography surrounded by high mountain ranges restricts pollution dispersion. Moreover, low rainfall in the dry season adds to the severity of the haze problem, because little smoke or dust is washed out of the air [6]. Under these conditions, the air pollutants cannot easily flow out of the area. There are some technologies that mitigate the pollution problem; however, the devices are considerably expensive.

Undoubtedly, an efficient warning system would be a major help in haze problem management. Such a system would improve public safety and mitigate the damage caused. The Goddard Earth Observing System Model Version 5 (GEOS-5), developed by NASA's research team, is currently one of the widely used pollution prediction models.

In this article, the potential use of fuzzy soft set theory in real-time haze warning is investigated. The main aims of this research are (i) to provide a haze warning system based on real-time atmospheric data and (ii) to identify the most hazardous location of the study area. The benefits are to raise awareness among people in the affected area and to suggest locations for installing pollution mitigation devices. Molodtsov [8–10] initiated the concept of soft set theory as a new mathematical tool for dealing with uncertainties. Soft set theory has rich potential for applications in several directions, a few of which were shown by Molodtsov [8]. Fuzzy soft set theory has already been applied to atmospheric and environmental models, in air pollution management [11–14] and water management [15–20]. However, air pollution models are believed to differ from region to region owing to several factors [21, 22]. Therefore, existing models still need to be restudied. To our knowledge, there are only a few prediction models for the region of study, since the main concerns have been on the side of environmental science. The prediction results from the GEOS-5 model are popular benchmarks for environmental scientists. The regionally developed models include a logistic regression model [23] and a Geographic Information System- (GIS-) based model [24]. Our model therefore offers an alternative prediction model for the haze pollution problem.

The study location covers eight provinces in the northern part of Thailand where the haze problem has occurred severely: Mae Hong Son, Chiang Mai, Lamphun, Chiang Rai, Phayao, Lampang, Phrae, and Nan. The density of PM10 is used as the severity index of the haze pollution level. Additionally, seven principal parameters are considered in the model: six atmospheric parameters (PM10 density, air pressure, relative humidity, wind speed, rainfall, and temperature) and one topographic parameter. All atmospheric data are obtained from the Pollution Control Department [1]. The data period is from 1st January 2016 to 31st May 2016.

The rest of this article is organized as follows. In Section 2, we explain the methodology and present some examples. In Section 3, we describe the setup of the model, which includes the study location, the data, and the parameters. Then, we present our decision-making results and discussion in Section 4. Finally, the conclusion is given in Section 5.

2. Methodology

2.1. Fuzzy Soft Theory

In this section, we provide useful notations of soft sets and fuzzy soft sets. Let $U$ be an initial universal set and let $E$ be a set of parameters.

Definition 1 (see [8]). Let $\mathcal{P}(U)$ denote the power set of $U$ and $A \subseteq E$. A pair $(F, A)$ is called a soft set over $U$, where $F$ is a mapping given by $F : A \to \mathcal{P}(U)$.

Example 1. Let the initial universe be the eight selected provinces in the northern region of Thailand: Mae Hong Son, Chiang Mai, Lamphun, Chiang Rai, Phayao, Lampang, Phrae, and Nan. Moreover, let the parameters be four atmospheric parameters: PM10 density, air pressure, relative humidity, and wind speed. A soft set over this universe then assigns a subset of the provinces to each parameter. Note that each approximation has two parts, a predicate and an approximate value set. For example, for the first parameter, the predicate is PM10 density and the approximate value set is the set of provinces assigned to it. An example of such a soft set is summarised in Table 1.

Definition 2 (see [25]). Let $\mathcal{F}(U)$ denote the set of all fuzzy sets of $U$ and let $A \subseteq E$. A pair $(F, A)$ is called a fuzzy soft set over $U$, where $F$ is a mapping given by $F : A \to \mathcal{F}(U)$.

Example 2. We consider the same setup as in Example 1. An example of a fuzzy soft set is summarised in Table 2.

Definition 3. For a given fuzzy soft set $(F, P)$ with a universal set $U = \{u_1, \dots, u_m\}$ and parameter set $P = \{p_1, \dots, p_n\}$, we denote by $\mu_{ij}$ the membership value of $u_i$ in $F(p_j)$.

Definition 4 (see [25]). For a given fuzzy soft set, the choice value of $u_i$ is defined by $c_i = \sum_{j=1}^{n} \mu_{ij}$.

Definition 5 (see [25]). For a given fuzzy soft set, the comparison table is the $m \times m$ table in which the entry $c_{ij}$ is the number of parameters for which the membership value of $u_i$ exceeds or equals the membership value of $u_j$. Both the rows and the columns of the table are labelled by the elements of the universal set.

Remark 1.
(1) Each main diagonal element of a comparison table is always equal to $n$.
(2) $0 \le c_{ij} \le n$ for all $i, j$.

Definition 6 (see [25]).
(i) The impact indicator of $u_i$ is the sum of all values on row $i$ of the comparison table: $r_i = \sum_{j=1}^{m} c_{ij}$.
(ii) The divider indicator of $u_j$ is the sum of all values on column $j$ of the comparison table: $t_j = \sum_{i=1}^{m} c_{ij}$.
(iii) The score value of $u_i$ is defined as $s_i = r_i - t_i$.

Both the choice values and the score values can be used as evaluations in decision making. However, according to Kong et al. [26], these values may lead to different decision results. Therefore, they introduced the grey relational grade, a new evaluation indicator that combines the information of both score values and choice values, to make the decision making more robust. The calculation algorithm of the grey relational grade is briefly presented.
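Before turning to the grey relational grade, the quantities in Definitions 4 to 6 can be illustrated with a minimal Python sketch. The function names and the membership matrix `mu` (rows are elements of the universal set, columns are parameters) are hypothetical and serve only as an illustration.

```python
from typing import List

Matrix = List[List[float]]


def choice_values(mu: Matrix) -> List[float]:
    """Choice value of each element: the row sum of its membership values."""
    return [sum(row) for row in mu]


def comparison_table(mu: Matrix) -> List[List[int]]:
    """Entry (i, j) counts the parameters k with mu[i][k] >= mu[j][k]."""
    m = len(mu)
    return [
        [sum(1 for a, b in zip(mu[i], mu[j]) if a >= b) for j in range(m)]
        for i in range(m)
    ]


def score_values(table: List[List[int]]) -> List[int]:
    """Score = impact indicator (row sum) minus divider indicator (column sum)."""
    m = len(table)
    impact = [sum(table[i]) for i in range(m)]
    divider = [sum(table[i][j] for i in range(m)) for j in range(m)]
    return [impact[i] - divider[i] for i in range(m)]


# Hypothetical membership values: 3 elements (rows) x 3 parameters (columns).
mu = [
    [0.2, 0.8, 0.5],
    [0.6, 0.4, 0.5],
    [0.9, 0.1, 0.3],
]
table = comparison_table(mu)   # diagonal entries equal the number of parameters
scores = score_values(table)   # the scores always sum to zero
```

In this toy example the first two elements tie on both the choice value and the score value, which is the kind of situation where a combined indicator becomes useful.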

Algorithm 1 (see [26]). Decision making based on grey relational grade.
(1) Input the choice value sequence $(c_1, \dots, c_m)$ and the score sequence $(s_1, \dots, s_m)$, where $c_i$ and $s_i$ are the choice value and the score value of $u_i$, respectively.
(2) Calculate the grey relational generating values.
(3) Calculate the grey difference information.
(4) Calculate the grey relative coefficients.
(5) Calculate the grey relational grade of each $u_i$.
(6) The decision is $u_k$ if its grey relational grade is the maximum over all elements. There may be more than one optimal choice if more than one element attains the maximum.

Decision making based on score values and choice values relies on the assumption that the parameters are equally important. However, in some decision-making problems, some parameters can be more important than others. Therefore, we propose new definitions of choice values and score values based on weight information. Note that the idea of a weighted score value is briefly discussed in Maji et al. [9].

Define a weight $w = (w_1, \dots, w_n)$ as a weight sequence of the parameters, where $w_j$ is the weight associated with the parameter $p_j$.

Definition 7. For a given fuzzy soft set and a weight $w = (w_1, \dots, w_n)$, the weighted choice value of $u_i$ is defined by $c_i^w = \sum_{j=1}^{n} w_j \mu_{ij}$.

Definition 8. For a given fuzzy soft set and a weight $w$, the weighted comparison table is the $m \times m$ table in which the entry $c_{ij}^w$ is calculated by $c_{ij}^w = \sum_{k=1}^{n} w_k \, \mathbb{1}(\mu_{ik} \ge \mu_{jk})$, where $\mathbb{1}$ is the indicator function, equal to 1 if $\mu_{ik} \ge \mu_{jk}$ and 0 otherwise. In other words, this is the weighted sum over the parameters for which the membership value of $u_i$ exceeds or equals the membership value of $u_j$. Both the rows and the columns of the table are labelled by the elements of the universal set.

Remark 2.
(1) Each main diagonal element of a weighted comparison table is always equal to the sum $\sum_{k=1}^{n} w_k$.
(2) $0 \le c_{ij}^w \le \sum_{k=1}^{n} w_k$ for all $i, j$, provided that the weights are nonnegative.

Definition 9.
(i) A weighted impact indicator is the impact indicator that is calculated based on the weighted comparison table associated with the weight $w$.
(ii) A weighted divider indicator is the divider indicator that is calculated based on the weighted comparison table associated with the weight $w$.
(iii) The weighted score values are the score values that are calculated based on the weighted impact indicator and the weighted divider indicator associated with the weight $w$.

Apparently, if $w = (1, 1, \dots, 1)$, the weighted choice values and the weighted score values are equal to the choice values and the score values defined in Definitions 4 and 6.
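The weighted quantities of Definitions 7 to 9 can be sketched in Python as follows. This is only an illustration: the function names, the membership matrix `mu`, and the weight `[2, 1, 0]` are hypothetical choices, not values from this study.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def weighted_choice_values(mu: Matrix, w: Sequence[float]) -> List[float]:
    """Weighted choice value: sum over parameters of weight times membership."""
    return [sum(wj * x for wj, x in zip(w, row)) for row in mu]


def weighted_comparison_table(mu: Matrix, w: Sequence[float]) -> List[List[float]]:
    """Entry (i, j) sums the weights of the parameters k with mu[i][k] >= mu[j][k]."""
    m = len(mu)
    return [
        [sum(wk for wk, a, b in zip(w, mu[i], mu[j]) if a >= b) for j in range(m)]
        for i in range(m)
    ]


def weighted_score_values(mu: Matrix, w: Sequence[float]) -> List[float]:
    """Weighted impact indicator (row sum) minus weighted divider indicator (column sum)."""
    table = weighted_comparison_table(mu, w)
    m = len(table)
    impact = [sum(table[i]) for i in range(m)]
    divider = [sum(table[i][j] for i in range(m)) for j in range(m)]
    return [impact[i] - divider[i] for i in range(m)]


# Hypothetical data: 3 elements (rows) x 3 parameters (columns).
mu = [
    [0.2, 0.8, 0.5],
    [0.6, 0.4, 0.5],
    [0.9, 0.1, 0.3],
]
w = [2, 1, 0]  # hypothetical weight: first parameter counts double, third is ignored
print(weighted_comparison_table(mu, w))
print(weighted_score_values(mu, w))
```

With the unit weight `[1, 1, 1]` the same functions return the unweighted comparison table and score values, in line with the remark following Definition 9.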

Example 3. Consider a decision-making problem with a given weight $w$ and a given fuzzy soft set. Our aim is to choose the optimum choice according to the weight $w$. By Definition 7, we obtain the weighted choice value sequence. Next, we calculate the weighted score values. By Definition 8, the weighted comparison table is shown in Table 3. Then, by Definition 9, the weighted impact indicators, the weighted divider indicators, and the weighted score values are shown in Table 4. Finally, we make a decision based on the grey relational grade. The calculation of Algorithm 1 is shown in Table 5, and the element with the maximum grey relational grade is the optimal choice.
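The steps of Algorithm 1 can be sketched in Python as below. The sketch follows one common formulation of grey relational analysis (min-max generating values, differences from the best value, a distinguishing coefficient `xi = 0.5`, and equal weights for the two sequences); these specific choices are assumptions made for illustration and may differ in detail from the formulation in [26]. The input sequences are hypothetical.

```python
from typing import List, Sequence


def _normalise(values: Sequence[float]) -> List[float]:
    """Min-max generating values in [0, 1]; a constant sequence maps to all ones."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [1.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def grey_relational_grades(
    choice: Sequence[float],
    score: Sequence[float],
    xi: float = 0.5,        # distinguishing coefficient (assumed value)
    w_choice: float = 0.5,  # weight of the choice-value coefficient (assumed)
    w_score: float = 0.5,   # weight of the score-value coefficient (assumed)
) -> List[float]:
    c, s = _normalise(choice), _normalise(score)
    # Grey difference information: distance of each element from the best one.
    d_c = [max(c) - v for v in c]
    d_s = [max(s) - v for v in s]
    d_max, d_min = max(d_c + d_s), min(d_c + d_s)
    if d_max == 0:  # every element is equally good
        return [w_choice + w_score] * len(c)

    def coef(d: float) -> float:
        """Grey relative coefficient of one difference value."""
        return (d_min + xi * d_max) / (d + xi * d_max)

    return [w_choice * coef(a) + w_score * coef(b) for a, b in zip(d_c, d_s)]


def optimal_indices(grades: Sequence[float]) -> List[int]:
    """All indices attaining the maximum grade (ties are kept)."""
    best = max(grades)
    return [i for i, g in enumerate(grades) if g == best]


# Hypothetical weighted choice values and weighted score values of 3 elements.
grades = grey_relational_grades([1.2, 1.6, 1.9], [-2, 0, 2])
print(optimal_indices(grades))
```

Because the grade blends both sequences, an element that leads on only one of them can still lose to an element that is close to the best on both.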

2.2. Particle Swarm Optimization

The particle swarm optimization (PSO) algorithm is a metaheuristic algorithm based on the concept of swarm intelligence. The algorithm was proposed in 1995 by Kennedy and Eberhart [27]. PSO is a metaheuristic because it makes few or no assumptions about the problem being optimized and can search very large spaces of candidate solutions. PSO also does not use the gradient of the objective, so the optimization problem need not be differentiable, as is required by classic optimization methods such as gradient descent and quasi-Newton methods. It is also capable of solving complex mathematical problems arising in engineering [28].

This method is available in software packages such as MATLAB and R.
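For readers unfamiliar with the method, a minimal global-best PSO can be sketched in Python as follows. This is a textbook-style illustration, not the MATLAB routine used in this study; the function name `pso`, the parameter values (inertia 0.7, acceleration coefficients 1.5), and the test objective are all assumptions chosen for the example.

```python
import random
from typing import Callable, List, Sequence, Tuple


def pso(
    objective: Callable[[List[float]], float],
    bounds: Sequence[Tuple[float, float]],
    n_particles: int = 30,
    n_iter: int = 200,
    inertia: float = 0.7,
    c1: float = 1.5,
    c2: float = 1.5,
    seed: int = 0,
) -> Tuple[List[float], float]:
    """Minimise `objective` inside box `bounds` with a basic global-best PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                  # best position seen by each particle
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # best position seen by the swarm

    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (
                    inertia * vel[i][d]
                    + c1 * r1 * (pbest[i][d] - pos[i][d])  # pull towards own best
                    + c2 * r2 * (gbest[d] - pos[i][d])     # pull towards swarm best
                )
                lo, hi = bounds[d]
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)  # stay in the box
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


# Toy objective with its minimum at (1, 1); no gradient information is used.
best, best_val = pso(lambda x: sum((xi - 1.0) ** 2 for xi in x), [(-5.0, 5.0)] * 2)
```

Since this sketch minimises, a quantity to be maximised, such as an accuracy ratio, would be passed in with its sign flipped.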

3. Model Construction

3.1. The Study Area

Our study area is in the northern region of Thailand, the haze pollution affected area. The region, approximately 94,000 km2 in size and six million in population, consists of nine provinces: Mae Hong Son, Chiang Mai, Lamphun, Chiang Rai, Phayao, Lampang, Phrae, Nan, and Uttaradit. For this case study, Uttaradit was excluded since its haze problem was not severe. The study area is geographically characterised by several mountain ranges, which continue from the Shan Hills in bordering Myanmar to Laos, and the river valleys which cut through them. The basins of rivers Ping, Wang, Yom, and Nan run from north to south. The basins cut across the mountains of two great ranges, the Thanon Thong Chai Range in the west and the Phi Pan Nam in the east. All studied provinces lie between these basins. The elevations are generally moderate, a little above 2,000 metres (6,600 ft) for the highest summit. Table 6 provides the geographic information summary of each province. The latitudes and longitudes shown are the locations of meteorology stations where atmospheric data are collected. The basin sizes are divided into five categories: no basin, wide, normal, moderate, and narrow, and we set the airflow difficulty level of each category to be 0, 1, 2, 3, and 4, respectively. The narrow basin implies that the flow of the air is more difficult. The location map of study area is shown in Figure 1.

3.2. The Data

The hourly atmospheric data of PM10 density (μg/m3, at 3 m from ground), air pressure (mmHg, at 2 m), relative humidity (%RH, at 2 m), wind speed (m/s, at 30 m), rainfall (mm, at 3 m), and temperature (°C, at 2 m) from 1st January 2016 to 31st May 2016 were obtained with authorization from the Pollution Control Department [1]. About 3% of the data were missing from the record. Each missing value was replaced by the value recorded at the preceding time. Figure 2 shows the daily fluctuation of PM10 density at the eight selected locations during the study period. Table 7 gives the summary statistics of PM10 density at the eight selected locations.
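The replacement rule for missing values described above is a forward fill. A minimal Python sketch, assuming missing readings are encoded as `None` (the encoding and the function name are hypothetical), could look like this:

```python
from typing import List, Optional


def forward_fill(series: List[Optional[float]]) -> List[Optional[float]]:
    """Replace each missing value (None) by the most recent observed value.

    Leading gaps have no preceding observation and are left as None.
    """
    filled: List[Optional[float]] = []
    last: Optional[float] = None
    for value in series:
        if value is None:
            filled.append(last)
        else:
            filled.append(value)
            last = value
    return filled


# Hypothetical hourly PM10 readings with gaps.
print(forward_fill([None, 50.0, None, None, 62.5, None]))
```

A run of consecutive gaps is filled with the same preceding value, so long gaps flatten the series over that stretch.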

3.3. The Parameters

Based on environmental research studies [30–33], the climate and the topography of the study area play significant roles in the pollution problem. Therefore, the parameter set in this application consists of seven parameters: PM10 density, air pressure, relative humidity, wind speed, rainfall, temperature, and airflow difficulty level. The first six are atmospheric parameters, while the last is a topographic parameter. Additionally, the effect of each atmospheric component on the PM10 density, the severity index, can be categorized into two types: positive and negative. A positive atmospheric component is one for which an increase in its value leads to an increase in the PM10 density, while a negative atmospheric component is one for which an increase in its value leads to a decrease in the PM10 density. The parameter information is summarised in Table 8.

4. Results and Discussion

4.1. Haze Warning System

The first aim of this research is to create a warning system based on real-time atmospheric data. The system predicts whether or not the PM10 density will exceed the crisis level in the following 4 hours. Note that the length of the warning period can be adjusted. In this article, we choose a period of 4 hours since it leaves reasonable time for safety measures such as buying protection masks, completing necessary outdoor activities, or evacuating to publicly designated safe zones. The warnings are announced every 4 hours, at 12 a.m., 4 a.m., 8 a.m., 12 p.m., 4 p.m., and 8 p.m. each day. The PM10 crisis level is set at 120 μg/m3 based on the Thailand national ambient air quality standard [34].

4.1.1. Warning System Based on PM10 Density

The trivial warning system is a warning that relies on the information of the PM10 density only. That is, a warning is signaled when the PM10 density at current time exceeds a certain threshold value. The warning system is generated by Algorithm 2.

Algorithm 2. Haze warning system based on PM10 density.
(1) At the warning time, input the PM10 density data of each location. The inputted data are the averages of the hourly data over the preceding 4 hours.
(2) A warning is signaled if the PM10 density of the location exceeds the threshold value $T$.

The efficiency of the algorithm is evaluated by the accuracy ratio compared to the real data. The prediction is counted as accurate if the warning is signaled and the PM10 density in the next 4 hours exceeds 120 μg/m3, or if the warning is not signaled and the PM10 density in the next 4 hours does not exceed 120 μg/m3. We test the algorithm with $T = 84, 96, 108, 114,$ and $118$ μg/m3, which are, respectively, 70%, 80%, 90%, 95%, and 98% of the crisis level. The accuracy ratios for each threshold value are shown in Table 9. The plot of the average accuracy ratio over all eight locations against the threshold value is shown in Figure 3. It can be seen that the best threshold value for these data is 118 (98% of the crisis level), with a 90.99% accuracy ratio.
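Algorithm 2 and its accuracy ratio can be sketched in Python as follows. The sketch assumes that "the PM10 density in the next 4 hours" is summarised by the mean of the next four hourly values and that warning times fall on every fourth hour of the series; both are interpretive assumptions, and the sample series is hypothetical.

```python
from typing import List

CRISIS_LEVEL = 120.0  # PM10 crisis level in μg/m3, as stated in the text


def mean(values: List[float]) -> float:
    return sum(values) / len(values)


def accuracy_ratio(hourly_pm10: List[float], threshold: float, window: int = 4) -> float:
    """Fraction of warning times at which the warning decision matches the outcome."""
    hits = total = 0
    # Warning times: every `window` hours with a full window before and after.
    for t in range(window, len(hourly_pm10) - window + 1, window):
        current = mean(hourly_pm10[t - window:t])  # average of the preceding 4 hours
        future = mean(hourly_pm10[t:t + window])   # assumed summary of the next 4 hours
        warned = current > threshold
        crisis = future > CRISIS_LEVEL
        hits += warned == crisis
        total += 1
    return hits / total


# Thresholds at 70%, 80%, 90%, 95%, and 98% of the crisis level, rounded to integers.
thresholds = [round(p * CRISIS_LEVEL) for p in (0.70, 0.80, 0.90, 0.95, 0.98)]

# Hypothetical hourly series for one location (five 4-hour blocks).
series = [100.0] * 4 + [115.0] * 4 + [130.0] * 4 + [125.0] * 4 + [90.0] * 4
for T in thresholds:
    print(T, accuracy_ratio(series, T))
```

A lower threshold warns earlier but produces more false alarms, while a threshold close to the crisis level misses episodes that are still building up, which is the trade-off explored in Table 9.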

4.1.2. Warning System Based on Fuzzy Soft Set with Weighted Information

To improve the efficiency of the warning system, the fuzzy soft set with weighted information can be employed. Note that the fuzzy soft set without weights is not suitable for this model because the parameters are not equally important. For instance, the PM10 density parameter is more important than the other parameters, since no haze problem will occur if the PM10 density is low. It should be noted that the membership values of the atmospheric parameters change at every warning based on the real-time data, while the topographic parameter remains the same throughout the time period. Additionally, when the weight information is $w = (1, 0, 0, 0, 0, 0, 0)$, that is, when all the weight is placed on PM10 density, this warning system reduces to the warning system based on PM10 density defined in Section 4.1.1.

The choice values are used in decision making. For this system, a warning is signaled when the weighted choice values at current time exceed a certain threshold value.

Our proposed decision making for the warning system with weighted information is as follows.

Algorithm 3. Haze warning system based on weighted choice values.
(1) At the warning time, input the atmospheric data of each location: PM10 density, air pressure, relative humidity, wind speed, rainfall, and temperature. The inputted data are the averages of the hourly data over the preceding 4 hours. Additionally, input the weight information $w$.
(2) Calculate the membership values of the parameters of the fuzzy soft set:
(i) For the PM10 density parameter, the membership values are calculated from the inputted PM10 density data.
(ii) For the other positive atmospheric parameters, the membership values are calculated from $\mu = (x - m)/(M - m)$, where $x$ is the inputted atmospheric component data and $m$ and $M$ are the minimum value and the maximum value of the atmospheric component during January to May 2016, respectively.
(iii) For negative atmospheric parameters, the membership values are calculated from $\mu = (M - x)/(M - m)$, where $x$, $m$, and $M$ are defined in (ii).
(iv) For the topographic parameter, the membership values are 0, 0.25, 0.5, 0.75, and 1 when the airflow difficulty levels are 0, 1, 2, 3, and 4, respectively.
(3) Calculate the weighted choice value of each location according to the weight information.
(4) A warning is signaled if the weighted choice value of the location exceeds the threshold value.

The flowchart of Algorithm 3 is given in Figure 4.
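The steps of Algorithm 3 can be sketched in Python as below. Several details are assumptions made only for illustration: the atmospheric memberships use min-max scaling clipped to [0, 1], the PM10 membership is taken as the density relative to the 120 μg/m3 crisis level capped at 1 (a hypothetical rule), and the example restricts itself to four parameters with made-up ranges and readings.

```python
from typing import List, Sequence, Tuple


def minmax_membership(x: float, lo: float, hi: float, positive: bool = True) -> float:
    """Min-max membership in [0, 1]; negative parameters are reversed."""
    if hi == lo:
        return 0.0
    z = min(max((x - lo) / (hi - lo), 0.0), 1.0)
    return z if positive else 1.0 - z


def pm10_membership(x: float, crisis_level: float = 120.0) -> float:
    """Hypothetical PM10 membership: density relative to the crisis level, capped at 1."""
    return min(x / crisis_level, 1.0)


def topography_membership(airflow_difficulty_level: int) -> float:
    """Levels 0..4 map to 0, 0.25, 0.5, 0.75, 1 as stated in the algorithm."""
    return airflow_difficulty_level / 4


def warning(
    memberships: Sequence[float], weights: Sequence[float], threshold: float
) -> Tuple[bool, float]:
    """Signal a warning when the weighted choice value exceeds the threshold."""
    choice = sum(w * mu for w, mu in zip(weights, memberships))
    return choice > threshold, choice


# Hypothetical 4-hour averages at one location; wind speed and rainfall are
# treated as negative parameters (less wind or rain, more severe haze).
memberships: List[float] = [
    pm10_membership(114.0),                             # PM10 density
    minmax_membership(1.0, 0.0, 5.0, positive=False),   # wind speed, assumed range 0-5 m/s
    minmax_membership(0.0, 0.0, 20.0, positive=False),  # rainfall, assumed range 0-20 mm
    topography_membership(3),                           # airflow difficulty level 3
]
print(warning(memberships, weights=[1, 0, 0, 0], threshold=0.9))
```

Putting all the weight on the first parameter, as in the call above, makes the decision depend on PM10 density alone; nonzero weights on the other parameters let calm, dry conditions raise the choice value.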

Clearly, the accuracy ratio of the model depends on the weight information and the threshold value. Calculation examples for several choices of weight information are shown in Table 10. In these examples, the threshold value is set to 90% of the possible maximum choice value, which depends on the weight information.

Since the aim is to find the weight information and the threshold value that give the best accuracy ratio, the problem can be posed as an optimization problem: maximize the average accuracy ratio over the weight information and the threshold value.
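One way to write this problem down is sketched below. Here $\mathrm{Acc}_k(w, T)$ denotes the accuracy ratio of Algorithm 3 at location $k$ for weight $w$ and threshold $T$; the symbol $T$, the nonnegativity of the weights, and the upper bound on the threshold (the largest attainable weighted choice value when all memberships equal 1) are assumptions made for illustration.

```latex
\max_{w,\,T}\ \frac{1}{8}\sum_{k=1}^{8} \mathrm{Acc}_k(w, T)
\quad \text{subject to} \quad
w_j \ge 0,\ j = 1, \dots, 7,
\qquad
0 \le T \le \sum_{j=1}^{7} w_j .
```

The objective is piecewise constant in $(w, T)$, since it only counts correct predictions, so gradient-based methods are not applicable and a derivative-free method such as PSO is a natural choice.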

By employing the particle swarm optimization method in MATLAB, the optimum average accuracy ratio is found to be 92.12%. The optimum weight, the optimum threshold, and the resulting accuracy ratios are shown in Table 11.

4.2. Identification of the Most Hazardous Location

The second aim of this research is to identify the location with the most serious haze pollution problem based on real-time atmospheric data. The location is identified at the same time as the warning. An effective prediction will benefit the community in the affected area and help the authorities to provide safety aids and prepare supporting devices such as mobile air purifiers.

4.2.1. Identification of the Most Hazardous Location Based on PM10 Density

Similar to Section 4.1.1, the simple decision making is to choose a location based on the information of PM10 density only. That is, the location with the highest value of PM10 density at current time is chosen as the most hazardous location in the following 4 hours.

The algorithm of the decision making is as follows.

Algorithm 4. Identification of the most hazardous location based on PM10 density.
(1) At the warning time, input the PM10 density data of each location. The inputted data are the averages of the hourly data over the preceding 4 hours.
(2) The decision is the location with the maximum value of PM10 density at the current time. There may be more than one optimal choice if more than one location attains the maximum.

The efficiency of the algorithm is evaluated by the accuracy ratio compared to the real data. The prediction is counted as accurate if the most severe location in the next 4 hours is correctly identified. By making decisions based on Algorithm 4, the average accuracy ratio over the eight locations is 51.15% and Cohen’s kappa index of agreement is 0.4312.
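The decision rule of Algorithm 4 and the Cohen's kappa index used to evaluate it can be sketched in Python as follows. The location labels and readings are hypothetical, and the kappa function implements the standard definition (observed agreement corrected for chance agreement), which is assumed to match the index reported here.

```python
from typing import Dict, List, Sequence


def most_hazardous(pm10: Dict[str, float]) -> List[str]:
    """Locations attaining the maximum PM10 density (ties are all returned)."""
    top = max(pm10.values())
    return [loc for loc, value in pm10.items() if value == top]


def cohens_kappa(predicted: Sequence[str], actual: Sequence[str]) -> float:
    """Cohen's kappa: (p_o - p_e) / (1 - p_e) for two label sequences."""
    n = len(actual)
    labels = set(predicted) | set(actual)
    p_o = sum(p == a for p, a in zip(predicted, actual)) / n
    p_e = sum(
        (list(predicted).count(label) / n) * (list(actual).count(label) / n)
        for label in labels
    )
    if p_e == 1.0:  # both sequences use a single identical label
        return 1.0
    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical 4-hour average PM10 densities at three locations.
print(most_hazardous({"A": 80.0, "B": 95.0, "C": 95.0}))

# Hypothetical predicted vs. observed most hazardous locations at four warning times.
predicted = ["A", "A", "B", "A"]
actual = ["A", "B", "B", "A"]
print(cohens_kappa(predicted, actual))
```

Kappa is a useful complement to the accuracy ratio here because a location that is frequently the most hazardous can be guessed correctly by chance.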

4.2.2. Identification of the Most Hazardous Location Based on Fuzzy Soft Set with Weighted Information

The fuzzy soft set with weighted information can be employed to improve the efficiency of the decision making. For reasons similar to those in Section 4.1.2, the fuzzy soft set with weights is more suitable. Note that the membership values of the atmospheric parameters change at every decision based on the real-time data, while the topographic parameter remains the same throughout the time period. Additionally, when the weight information is $w = (1, 0, 0, 0, 0, 0, 0)$, that is, when all the weight is placed on PM10 density, this decision making reduces to the decision making based on PM10 density defined in Section 4.2.1. It should be emphasized that the membership calculation of the PM10 density parameter is different from that in Algorithm 3, because here the locations need to be compared with one another.

Finally, the evaluation used for decision making must be chosen. The decision can be evaluated based on choice values, score values, or grey relational grades. In our results, we use all three evaluations in order to determine which one gives the best result.

Our proposed algorithm for decision making of the most hazardous location based on weighted choice values is as follows.

Algorithm 5. Identification of the most hazardous location based on weighted choice values.
(1) At the warning time, input the atmospheric data of each location: PM10 density, air pressure, relative humidity, wind speed, rainfall, and temperature. The inputted data are the averages of the hourly data over the preceding 4 hours. Additionally, input the weight information $w$.
(2) Calculate the membership values of the parameters of the fuzzy soft set:
(i) For positive atmospheric parameters, the membership values are calculated from $\mu = (x - m)/(M - m)$, where $x$ is the inputted atmospheric component data and $m$ and $M$ are the minimum value and the maximum value of the atmospheric component during January to May 2016, respectively.
(ii) For negative atmospheric parameters, the membership values are calculated from $\mu = (M - x)/(M - m)$, where $x$, $m$, and $M$ are defined in (i).
(iii) For the topographic parameter, the membership values are 0, 0.25, 0.5, 0.75, and 1 when the airflow difficulty levels are 0, 1, 2, 3, and 4, respectively.
(3) Calculate the weighted choice value of each location.
(4) The decision is the location whose weighted choice value is the maximum. There may be more than one optimal choice if more than one location attains the maximum.

The flowchart of Algorithm 5 is given in Figure 5.
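A compact Python sketch of Algorithm 5 is given below, assuming min-max memberships for the atmospheric parameters. For brevity it uses only two atmospheric parameters plus topography; the location labels, value ranges, parameter signs, readings, and weights are all hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def membership(x: float, lo: float, hi: float, positive: bool) -> float:
    """Min-max membership; reversed for negative parameters."""
    z = (x - lo) / (hi - lo)
    return z if positive else 1.0 - z


def identify(
    locations: Dict[str, Tuple[Sequence[float], int]],
    ranges: Sequence[Tuple[float, float]],
    positive: Sequence[bool],
    weights: Sequence[float],
) -> Tuple[List[str], Dict[str, float]]:
    """Return the locations with the largest weighted choice value, plus all values.

    `locations` maps a name to (atmospheric readings, airflow difficulty level);
    the topographic membership (level / 4) is appended as the last parameter.
    """
    choice: Dict[str, float] = {}
    for name, (values, level) in locations.items():
        mus = [
            membership(x, lo, hi, pos)
            for x, (lo, hi), pos in zip(values, ranges, positive)
        ]
        mus.append(level / 4)
        choice[name] = sum(w * mu for w, mu in zip(weights, mus))
    best = max(choice.values())
    return [name for name, value in choice.items() if value == best], choice


# Hypothetical readings: [PM10 density, wind speed] and airflow difficulty level.
locations = {"A": ([100.0, 1.0], 4), "B": ([110.0, 4.0], 1)}
ranges = [(0.0, 200.0), (0.0, 5.0)]  # assumed seasonal minimum and maximum
positive = [True, False]             # wind speed treated as a negative parameter

print(identify(locations, ranges, positive, weights=[1, 0, 0])[0])  # PM10 only
print(identify(locations, ranges, positive, weights=[2, 1, 1])[0])  # all parameters
```

In this toy case location B has the higher PM10 density, but location A is selected once calm wind and a narrow basin are taken into account, which illustrates how the extra parameters can change the decision.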

Remark 3. If the decision making is based on weighted score values or grey relational grades, then Step 3 and Step 4 of Algorithm 5 are changed accordingly.

Clearly, the accuracy ratio of the model depends on the weight information. Table 12 displays the accuracy ratio of the location identification by Algorithm 5 for sample weight information, where the decision making is based on choice values, score values, and grey relational grades.

Since our aim is to find the weight information that gives the best accuracy ratio, the problem can again be posed as an optimization problem: maximize the average accuracy ratio over the weight information. By employing the particle swarm optimization method in MATLAB, the optimum average accuracy ratios based on weighted choice values, weighted score values, and grey relational grades are 56.58%, 57.13%, and 57.02%, respectively. The summary of the optimum result is shown in Table 13. Cohen’s kappa values of the decision making based on weighted choice values, weighted score values, and grey relational grades are 0.4457, 0.4521, and 0.4489, respectively.

4.3. Discussion
4.3.1. Haze Warning System

By introducing the fuzzy soft model with weighted information, the prediction accuracy ratio of the warning system is improved slightly, from 90.99% to 92.12%, compared to the simple warning system that only considers the PM10 density. Moreover, the fuzzy soft models with weighted information provide better prediction than the original (equal weight) fuzzy soft model. Table 14 shows the parameters’ weights that provide the best accuracy ratio. Note that the principal parameters are PM10 density, rainfall, and wind speed, while the other parameters have no weight. This suggests that a simple judgment on the warning can be made by observing only PM10 density, wind speed, and rainfall. The problem is expected to be severe if the PM10 density is high and there is no wind and no rain. This agrees with the main findings of environmental science research.

4.3.2. Identification of the Most Hazardous Location

By selecting the most severe location based on the information from PM10 density only, the accuracy ratio is 51.12%. However, this ratio is improved to 57.13% when the locations are chosen by the weighted fuzzy soft model, with the decision based on weighted score values. Table 15 shows the parameters’ weights that provide the best accuracy ratio.

The optimal parameters’ weights imply the following:
(1) PM10 density is clearly the main factor in the decision making.
(2) Topography plays a role in the haze pollution problem for this region of study.
(3) Temperature, wind speed, and rainfall are factors in the model. Unfortunately, these atmospheric parameters are uncontrollable.
(4) Air pressure and relative humidity have little or no impact on the prediction model.

This analysis agrees with the main findings of environmental science research. It should be emphasized that the only parameter that can be controlled is PM10 density. Activities that contribute to PM10, such as outdoor burning and car emissions, should be discouraged.

4.3.3. Other Discussion

From the results of Sections 4.1 and 4.2, it should be pointed out that a simple warning system and location identification based on PM10 density alone are already reasonably effective. Adding the other parameters improves the efficiency of the model only slightly. This emphasizes the fact that environmental modeling is complicated. However, since the calculation of our algorithms is not expensive, Algorithms 3 and 5 are still worth using to improve the decision making.

For further work, our suggestions are to add the following parameters:
Atmospheric parameters: PM2.5 density, SO2, ozone, and wind direction.
Topographic parameters: height above sea level of the location, location of the surrounding mountains, and height of the surrounding mountains.
Other parameters: population.

5. Conclusions

In this article, we propose a fuzzy soft model to support haze pollution management in northern Thailand. The main aims of this research are to provide a haze warning system based on real-time atmospheric data and to identify the most hazardous location of the study area. The study area covers eight provinces in northern Thailand, where the problem occurs severely every year. The parameters of the fuzzy soft set include both atmospheric parameters and a topographic parameter. The membership values of the atmospheric parameters are calculated from the real-time data. The efficiency of the model is tested with real data from 1st January 2016 to 31st May 2016. The results show that our fuzzy models improve the prediction accuracy ratio compared to prediction based on PM10 density only. The optimum results and optimum weights are obtained by particle swarm optimization. The interpretation of the optimum weights also agrees with the main findings of environmental science research. Another benefit of our model is that the topographic parameter, which is normally disregarded in many models, is included. Moreover, our model offers an alternative prediction model for the haze pollution problem in northern Thailand.

The fuzzy soft set approach to haze pollution management offers very promising prospects and possibilities. We strongly believe that the efficiency of the model can be improved when appropriate parameters are added. The calculation formulas for the membership values and the severity index can also be adjusted. An efficient model will improve the health safety and raise the quality of life of those affected.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The author declares that there are no conflicts of interest.

Acknowledgments

The author would like to thank the Department of Environmental Science, Faculty of Science, Chiang Mai University, Thailand, and the Pollution Control Department, Ministry of Natural Resources and Environment, Thailand, for providing the atmospheric data. This research was supported by the Centre of Excellence in Mathematics, CHE, and Chiang Mai University.