Abstract

Fuzzy time series (FTS) is a forecasting method based on fuzzy logic that was first introduced by Song and Chissom. The fuzzy time series Markov chain (FTS-Markov chain) uses a Markov chain in the defuzzification step. The choice of interval length plays an important role in forming the fuzzy logical relationships (FLRs), which in turn determine the forecast values. One method for determining the interval length is the average-based method. However, several studies partition the intervals based on frequency density to obtain a better interval length and thus better forecasting accuracy. This study combines the fuzzy time series Markov chain, the average-based fuzzy time series, and the fuzzy time series based on frequency density partitioning into an average-based fuzzy time series Markov chain based on frequency density partitioning, which redivides the intervals of the average-based fuzzy time series Markov chain method according to frequency density. The method is applied to forecasting the Indonesian Islamic stock index (ISSI) for the selected period. Accuracy measured with the mean square error (MSE) and the mean absolute percentage error (MAPE) shows that the average-based fuzzy time series Markov chain based on frequency density partitioning forecasts with a high level of accuracy.

1. Introduction

A time series is a collection of observations made sequentially over time. Usually, observations in a time series are not independent; that is, they are correlated. Thus, the order of the observations is important. As a result, statistical procedures and techniques that assume independence are no longer valid, and different methods and approaches are needed. Time series analysis aims at description, explanation, prediction, and monitoring [1].

Forecasting predicts the values of a variable based on known values of that variable or of related variables [2]. The rationale for time series forecasting is that current observations depend on previous observations. Many forecasting methods therefore use time series data, including fuzzy time series, smoothing, averaging, and moving averages.

The forecasting method based on fuzzy logic, hereinafter called fuzzy time series, was first proposed by Song and Chissom in 1993. Song and Chissom used time-invariant and time-variant methods in forecasting. Since then, several fuzzy time series (FTS) methods have been developed, including those of Chen [3, 4] and Chen and Hsu [5], as well as weighted [6], backpropagation [7], multiple-attribute [8], percentage change [9], and Markov chain [10] variants.

The Markov chain is a stochastic process in which future events depend only on the current state and not on past states. The Markov chain is defined by a transition probability matrix that contains the information governing the system’s movement from one state to another [10]. In fuzzy time series, the Markov chain is used in the defuzzification stage [10]. Defuzzification is the calculation step of fuzzy time series forecasting based on a fuzzy logical relationship group (FLRG). In an FLRG, there is a relationship between the current and next states. The current state is the value that will be calculated as the predicted value, and the next state is the data used to get the value in the current state. Therefore, the relationship between the current state and the next state in the FLRG is considered a conditional process in line with the basic principles of the Markov chain method [10]. The Markov chain is used in several fields. For example, Indriyani and Pratiwi, et al. [11] use the Markov chain to model the measles spread pattern, and Prasetya and Ferdian [12] use it in scheduling Oerlikon machine maintenance to optimize maintenance costs and time; their results show that calculations with the Markov chain produce better time and costs.

Ruey-Chyn Tsaur combined the fuzzy time series method with the Markov chain concept to predict the exchange rate of the Taiwanese currency against the dollar. The resulting method is known as the fuzzy time series Markov chain, and it achieved better accuracy than the plain fuzzy time series [10]. Many studies have used the FTS-Markov chain, including Dinatha et al. [13], who used it to predict export profits. In addition, Hidayah and Sugiman [14] and Mangkunegara and Yerizon [15] both used the FTS-Markov chain to forecast exchange rates. The studies in [13–15] produced small error values, but the FLR determination still has a shortcoming: forecasting results differ because the interval length is chosen according to each researcher’s judgment.

In the fuzzy time series method, there is no definite formula for the interval length; the intervals are chosen by the researcher [16]. Yet the interval length strongly influences the formation of the fuzzy logical relationships (FLRs) and therefore the forecasting results [17].

One method for determining the interval length is the average-based model introduced by Xihao and Yimin [17]. Their research also shows that the average-based fuzzy time series method produces better forecasts than Chen’s fuzzy time series method. Wuryanto and Puspita [18] and Kumar N. and Kumar H. [19] used the average-based FTS model to predict the development of confirmed cases of COVID-19. The studies in [18, 19] report small error values, but the interval length is still less than optimal because every interval has the same length regardless of the frequency of the data in it.

Furthermore, Chen and Hsu [5] extended Chen’s fuzzy time series by repartitioning the intervals based on frequency density to predict the number of applicants at the University of Alabama. Chen and Hsu redivided the intervals before the fuzzification process. Their research shows that, after the intervals are redivided, the fuzzy time series achieves better accuracy than other existing fuzzy time series methods. Irawanto et al. [20] used frequency density-based partitioning for stock index forecasting. Wulandari et al. [21] used frequency density partitioning to forecast petroleum production, which resulted in a small error value.

Based on the description above, this study examines the average-based fuzzy time series Markov chain based on frequency density partitioning (FDP). To the authors’ knowledge, based on the literature reviewed above, no study has redivided intervals based on frequency density in the average-based fuzzy time series Markov chain method. In addition, frequency density partitioning produces subintervals, determined empirically from the data, that bring the fuzzy numbers closer to the crisp values. This idea is illustrated in Figure 1.

This method is applied to forecast the Indonesian Islamic stock index (ISSI). The accuracy of the method is assessed using the mean square error (MSE) and the mean absolute percentage error (MAPE). For comparison, Chen’s fuzzy time series is used.

2. Basic Knowledge

2.1. Fuzzy Time Series

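The definition of fuzzy time series was first introduced by Song and Chissom [16]. Let $U$ be the universe of discourse, with $U = \{u_1, u_2, \ldots, u_n\}$. A fuzzy set $A_i$ on $U$ is defined as

$$A_i = \frac{f_{A_i}(u_1)}{u_1} + \frac{f_{A_i}(u_2)}{u_2} + \cdots + \frac{f_{A_i}(u_n)}{u_n} \tag{1}$$

where $f_{A_i}$ is the membership function of the fuzzy set $A_i$, $u_k$ is an element of the fuzzy set $A_i$, and $f_{A_i}(u_k)$ shows the degree of membership of $u_k$ in $A_i$, where $f_{A_i}(u_k) \in [0, 1]$ and $1 \le k \le n$.

Definition 1. Assume that $Y(t)$ ($t = \ldots, 0, 1, 2, \ldots$), a subset of $\mathbb{R}$, is the universe of discourse on which the fuzzy sets $f_i(t)$ ($i = 1, 2, \ldots$) are defined. If $F(t)$ is a collection of $f_1(t), f_2(t), \ldots$, then $F(t)$ is called a fuzzy time series on $Y(t)$ [5].

Definition 2. If $F(t)$ is caused by $F(t-1)$, then the relation in the first-order model can be stated as follows [3]:

$$F(t) = F(t-1) \circ R(t, t-1) \tag{2}$$

where “$\circ$” is the max–min composition operator and $R(t, t-1)$ is a relation matrix that describes the fuzzy relationship between $F(t-1)$ and $F(t)$.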
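
To make the max–min composition in Definition 2 concrete, the following minimal sketch applies it to a membership vector and a relation matrix. The function name `max_min_compose` and all numeric values are illustrative assumptions and are not taken from the paper.

```python
def max_min_compose(f_prev, relation):
    """Max-min composition F(t) = F(t-1) o R(t, t-1).

    f_prev   : membership degrees of F(t-1), one per fuzzy set
    relation : fuzzy relation matrix R as a list of rows
    Returns the membership degrees of F(t).
    """
    n_cols = len(relation[0])
    # For each column j, take the max over i of min(f_prev[i], R[i][j]).
    return [
        max(min(f_prev[i], relation[i][j]) for i in range(len(f_prev)))
        for j in range(n_cols)
    ]


# Illustrative values only (not from the paper).
f_prev = [1.0, 0.5, 0.0]
R = [
    [0.2, 1.0, 0.5],
    [0.8, 0.3, 0.0],
    [0.1, 0.6, 1.0],
]
f_next = max_min_compose(f_prev, R)
```

Each entry of `f_next` is the strongest "weakest link" between the previous memberships and the corresponding column of the relation matrix.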

2.2. Average-Based Algorithm

The average-based algorithm sets the interval length at the initial stage of fuzzy time series forecasting. The steps of the average-based algorithm are as follows [17]:
(a) Determine the average absolute difference (lag) between data $D_{t+1}$ and data $D_t$ with the formula
$$\text{mean} = \frac{\sum_{t=1}^{n-1} |D_{t+1} - D_t|}{n - 1} \tag{3}$$
(b) Determine the length of the interval as half of this average:
$$l = \frac{\text{mean}}{2} \tag{4}$$
(c) Based on the interval length obtained from step (b), determine the basis value of the interval length according to Table 1
(d) Round the length of the interval according to the interval basis table
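
The steps above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: Table 1 is not reproduced in the text, so the thresholds in `base_for` follow the base mapping commonly used in the average-based literature, and the function names and sample series are hypothetical.

```python
def base_for(length):
    """Return the rounding base for a raw interval length.

    The thresholds are an assumption (the usual average-based base
    mapping), because Table 1 is not reproduced in the text.
    """
    if length <= 1.0:
        return 0.1
    if length <= 10.0:
        return 1.0
    if length <= 100.0:
        return 10.0
    return 100.0


def average_based_length(data):
    """Interval length from the mean absolute first difference."""
    # (a) average absolute difference between consecutive observations
    diffs = [abs(b - a) for a, b in zip(data, data[1:])]
    mean_diff = sum(diffs) / len(diffs)
    # (b) half of that average is the raw interval length
    half = mean_diff / 2
    # (c) pick the base, (d) round the raw length to a multiple of it.
    # Note: Python's round() resolves exact ties to the even neighbour.
    base = base_for(half)
    return round(half / base) * base


# Hypothetical series, not the ISSI data.
series = [180.0, 183.0, 181.0, 185.0, 183.0]
length = average_based_length(series)
```

For this series the absolute differences are 3, 2, 4, and 2, so the mean is 2.75, half of it is 1.375, the base is 1, and the rounded interval length is 1.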

2.3. Frequency Density Partition

Chen and Hsu [5] developed a fuzzy time series method that redivides intervals based on frequency density. In their method, after the universe of discourse is partitioned into intervals of equal length, these intervals are subpartitioned based on frequency density according to the following rules:
(a) the interval with the highest frequency is divided into 4 subintervals
(b) the interval with the second highest frequency is divided into 3 subintervals
(c) the interval with the third highest frequency is divided into 2 subintervals
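
The rules above can be sketched as follows. This is an illustrative sketch, not the authors' implementation: the function name, the handling of ties (intervals with equal counts receive the same split), the removal of empty intervals, and the sample data are all assumptions.

```python
def frequency_density_partition(intervals, data):
    """Redivide equal-length intervals by frequency density.

    intervals : sorted list of (lower, upper) pairs of equal length
    data      : observations to distribute over the intervals
    Returns the new list of (lower, upper) pairs.
    """
    # Count observations per interval; only the last interval is
    # closed on the right so that the maximum value is not lost.
    counts = []
    for k, (lo, hi) in enumerate(intervals):
        last = k == len(intervals) - 1
        counts.append(
            sum(1 for x in data if lo <= x < hi or (last and x == hi))
        )
    # Rank the distinct non-zero counts: highest -> 4 parts,
    # second -> 3 parts, third -> 2 parts (ties share a split).
    ranked = sorted({c for c in counts if c > 0}, reverse=True)
    splits = dict(zip(ranked, (4, 3, 2)))
    result = []
    for (lo, hi), c in zip(intervals, counts):
        if c == 0:
            continue  # assumption: intervals without data are dropped
        parts = splits.get(c, 1)
        width = (hi - lo) / parts
        for p in range(parts):
            result.append((lo + p * width, lo + (p + 1) * width))
    return result


# Hypothetical intervals and data, not the ISSI values.
intervals = [(0.0, 1.0), (1.0, 2.0), (2.0, 3.0), (3.0, 4.0), (4.0, 5.0)]
data = [0.1, 0.2, 0.3, 0.4, 1.1, 1.2, 1.3, 2.1, 2.2, 3.5]
new_intervals = frequency_density_partition(intervals, data)
```

Here the five original intervals hold 4, 3, 2, 1, and 0 observations, so they yield 4, 3, 2, 1, and 0 subintervals, respectively.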

2.4. Markov Chain

Markov first introduced the Markov chain in 1906. Markov chain analysis is a method that studies the past behavior of a variable to estimate its behavior in the future. Conceptually, the Markov chain can be described by considering a stochastic process $\{X_n, n = 0, 1, 2, \ldots\}$ that takes a finite or countable number of possible values. The set of possible values of the stochastic process is denoted by the set of nonnegative integers $\{0, 1, 2, \ldots\}$.

If $X_n = i$, then the process is said to be in state $i$ at time $n$. Assume that whenever the process is in state $i$, there is a fixed probability $P_{ij}$ that it will next move to state $j$. Thus, it can be written as follows:

$$P\{X_{n+1} = j \mid X_n = i, X_{n-1} = i_{n-1}, \ldots, X_1 = i_1, X_0 = i_0\} = P_{ij} \tag{5}$$

for all states $i_0, i_1, \ldots, i_{n-1}, i, j$ and all $n \ge 0$. This process is called the Markov chain.

Equation (5) states that, in a Markov chain, the conditional distribution of the future state, given the past states and the current state, does not depend on the past states but only on the current state.

The value of $P_{ij}$ represents the probability of a transition from state $i$ to state $j$. Because probabilities are nonnegative and the process must move to some state, $P_{ij} \ge 0$ for $i, j \ge 0$ and $\sum_{j=0}^{\infty} P_{ij} = 1$ for $i = 0, 1, \ldots$. Let $P$ be the transition probability matrix of the $P_{ij}$; then, it can be denoted in the following equation [10]:

$$P = \begin{bmatrix} P_{00} & P_{01} & P_{02} & \cdots \\ P_{10} & P_{11} & P_{12} & \cdots \\ \vdots & \vdots & \vdots & \\ P_{i0} & P_{i1} & P_{i2} & \cdots \\ \vdots & \vdots & \vdots & \end{bmatrix} \tag{6}$$

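In the fuzzy time series, the Markov chain enters through a transition probability matrix, which is the basis for the forecasting calculations. The probability of moving from the current state to the next state is obtained from the FLRG. State transition probabilities are written as follows [10]:

$$P_{ij} = \frac{M_{ij}}{M_i}, \quad i, j = 1, 2, \ldots, n \tag{7}$$

where $P_{ij}$ is the probability of a one-step transition from state $A_i$ to state $A_j$, $M_{ij}$ is the number of one-step transitions from state $A_i$ to state $A_j$, and $M_i$ is the amount of data included in state $A_i$.

The probability matrix $R$ of all states has dimension $n \times n$, with $n$ being the number of fuzzy sets, and can be written as follows [10]:

$$R = \begin{bmatrix} P_{11} & P_{12} & \cdots & P_{1n} \\ P_{21} & P_{22} & \cdots & P_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ P_{n1} & P_{n2} & \cdots & P_{nn} \end{bmatrix} \tag{8}$$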
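
As a rough illustration of how the transition probabilities are estimated from a fuzzified series, the sketch below counts one-step transitions and normalizes each row. It assumes 0-based state indices and takes the row total as the number of transitions leaving a state, so that every non-empty row sums to 1; the function name and the sample sequence are hypothetical.

```python
def transition_matrix(states, n_states):
    """Estimate one-step transition probabilities from a state sequence.

    states   : sequence of 0-based fuzzy-set indices, in time order
    n_states : number of fuzzy sets n
    Returns an n x n list of lists with entries M_ij / M_i.
    """
    counts = [[0] * n_states for _ in range(n_states)]
    for current, nxt in zip(states, states[1:]):
        counts[current][nxt] += 1
    matrix = []
    for row in counts:
        total = sum(row)  # assumption: M_i = transitions leaving state i
        # A state with no outgoing transition keeps an all-zero row.
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


# Hypothetical fuzzified sequence, not the ISSI data.
sequence = [0, 1, 0, 1, 0, 2, 1]
P = transition_matrix(sequence, 3)
```

In this sequence, state 0 is left three times, twice to state 1 and once to state 2, so the first row holds the probabilities 2/3 and 1/3.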

2.5. Average-Based Fuzzy Time Series Markov Chain Based on Frequency Density Partitioning

The average-based fuzzy time series Markov chain based on frequency density partitioning is a forecasting method that combines fuzzy time series with the Markov chain, uses the average-based method to determine the interval length, and then redivides the intervals based on frequency density.

2.6. Forecasting Error Measurement

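The reliability of a forecast can be assessed with the mean square error (MSE) and the mean absolute percentage error (MAPE), defined as follows [1]:

$$\text{MSE} = \frac{1}{n} \sum_{t=1}^{n} (X_t - F_t)^2 \tag{9}$$

$$\text{MAPE} = \frac{1}{n} \sum_{t=1}^{n} \frac{|X_t - F_t|}{X_t} \times 100\% \tag{10}$$

where $X_t$ is the actual data in period $t$, $F_t$ is the forecast value for period $t$, and $n$ is the number of forecasted data.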
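
The two error measures can be computed directly from paired actual and forecast values. The sketch below is a plain-Python illustration; the function names and the sample values are assumptions and do not reproduce the ISSI results.

```python
def mse(actual, forecast):
    """Mean square error over paired actual and forecast values."""
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)


def mape(actual, forecast):
    """Mean absolute percentage error, in percent.

    Assumes every actual value is non-zero.
    """
    total = sum(abs(a - f) / abs(a) for a, f in zip(actual, forecast))
    return 100 * total / len(actual)


# Hypothetical values, not the ISSI data.
actual = [100, 200, 400]
forecast = [110, 190, 400]
error_mse = mse(actual, forecast)
error_mape = mape(actual, forecast)
```

For these values the squared errors are 100, 100, and 0, and the percentage errors are 10%, 5%, and 0%, so the MSE is about 66.67 and the MAPE is 5%.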

3. Materials and Methods

3.1. Data Collection

The data used in this study are weekly data on the Indonesian Sharia Stock Price Index (ISSI) for the period June 2019–May 2021, obtained from the http://yahoo.finance.com site. The forecasting results are validated using the MSE and MAPE values and are then compared with Chen’s fuzzy time series.

3.2. Forecasting Method

This study uses an average-based fuzzy time series Markov chain based on frequency density partitioning. It differs from previous research in applying frequency density partitioning to the average-based fuzzy time series Markov chain, which brings the fuzzy values closer to the crisp values and thereby improves forecasting accuracy. The steps are as follows [5, 10, 17, 22]:
(a) Define the universe of discourse
(b) Divide the universe of discourse into intervals using the average-based method
(c) Distribute all research data into the intervals
(d) Determine the frequency density
(e) Redivide the intervals based on frequency density
(f) Define the fuzzy sets. Let $A_1, A_2, \ldots, A_n$ be fuzzy sets that are linguistic values of a linguistic variable; the fuzzy set $A_i$ in the universe of discourse $U$ is defined as follows:
$$A_i = \frac{f_{A_i}(u_1)}{u_1} + \frac{f_{A_i}(u_2)}{u_2} + \cdots + \frac{f_{A_i}(u_n)}{u_n} \tag{11}$$
where $u_j$ is an element of the universe of discourse ($j = 1, 2, \ldots, n$) and $f_{A_i}(u_j)$ is its degree of membership in $A_i$
(g) Fuzzify the historical data
(h) Define the FLRs
(i) Define the FLRGs
(j) Determine the transition probability with the formula
$$P_{ij} = \frac{M_{ij}}{M_i}, \quad i, j = 1, 2, \ldots, n \tag{12}$$
and determine the probability matrix $R$ of all states of the FLRG with dimensions $n \times n$, where $n$ is the number of fuzzy sets, which can be written as follows:
$$R = \begin{bmatrix} P_{11} & P_{12} & \cdots & P_{1n} \\ P_{21} & P_{22} & \cdots & P_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ P_{n1} & P_{n2} & \cdots & P_{nn} \end{bmatrix} \tag{13}$$
(k) Calculate the initial forecast value using the probability matrix with the following rules:

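Rule 1: if the FLRG is one to one (e.g., $A_i \to A_k$, where $P_{ik} = 1$ and $P_{ij} = 0$ for $j \ne k$), the forecast $F(t)$ is $m_k$, which is the middle value of $u_k$, with the following equation:
$$F(t) = m_k P_{ik} = m_k \tag{14}$$

Rule 2: if the FLRG is one to many (e.g., $A_j \to A_1, A_2, \ldots, A_n$, $j = 1, 2, \ldots, n$), when $Y(t-1)$ at time $t-1$ is included in state $A_j$, then the forecast $F(t)$ is
$$F(t) = m_1 P_{j1} + m_2 P_{j2} + \cdots + m_{j-1} P_{j(j-1)} + Y(t-1) P_{jj} + m_{j+1} P_{j(j+1)} + \cdots + m_n P_{jn} \tag{15}$$

where $m_1, m_2, \ldots, m_{j-1}, m_{j+1}, \ldots, m_n$ are the middle values of $u_1, u_2, \ldots, u_{j-1}, u_{j+1}, \ldots, u_n$ and $Y(t-1)$ is the state value of $A_j$ at time $t-1$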

Rule 3: if the FLRG is empty ($A_i \to \varnothing$), the forecast value $F(t)$ is $m_i$, which is the middle value of $u_i$, with the following equation:
$$F(t) = m_i \tag{16}$$
(l) Adjust the trend of the forecast values with the following rules:
(i) If state $A_i$ communicates with $A_i$, starting from state $A_i$ at time $t-1$ expressed as $F(t-1) = A_i$ and undergoing an increasing transition to state $A_j$ at time $t$, where ($i < j$), then the adjustment value is
$$D_{t1} = \frac{l}{2} \tag{17}$$
where $l$ is the interval length
(ii) If state $A_i$ communicates with $A_i$, starting from state $A_i$ at time $t-1$ expressed as $F(t-1) = A_i$ and undergoing a decreasing transition to state $A_j$ at time $t$, where ($i > j$), the adjustment value is
$$D_{t1} = -\frac{l}{2} \tag{18}$$
(iii) If state $A_i$ at time $t-1$ is expressed as $F(t-1) = A_i$ and undergoes a jump-forward transition to state $A_{i+s}$ at time $t$, where ($1 \le s \le n - i$), then the adjustment value is
$$D_{t2} = \frac{l}{2} s \tag{19}$$
where $s$ is the number of forward jumps
(iv) If state $A_i$ at time $t-1$ is expressed as $F(t-1) = A_i$ and undergoes a jump-backward transition to state $A_{i-v}$ at time $t$, where ($1 \le v \le i$), then the adjustment value is
$$D_{t2} = -\frac{l}{2} v \tag{20}$$
where $v$ is the number of backward jumps
(m) Determine the final forecast value based on the trend adjustment of the forecast value.

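If the FLRG of $A_i$ is one to many and state $A_{i+1}$ can be accessed from state $A_i$, where state $A_i$ communicates with $A_i$, then the forecasting result becomes $F'(t) = F(t) + D_{t1} + D_{t2} = F(t) + (l/2) + (l/2) = F(t) + l$. If the FLRG of $A_i$ is one to many and state $A_{i+1}$ can be accessed from $A_i$, where state $A_i$ does not communicate with $A_i$, then the forecasting value becomes $F'(t) = F(t) + D_{t2} = F(t) + l/2$. If the FLRG of $A_i$ is one to many and state $A_{i-2}$ can be accessed from state $A_i$, where $A_i$ does not communicate with $A_i$, then the forecasting result is $F'(t) = F(t) + D_{t2} = F(t) - (l/2) \times 2 = F(t) - l$. If $v$ is the jump step, the general form of the forecast is
$$F'(t) = F(t) + D_{t1} + D_{t2} = F(t) \pm \frac{l}{2} \pm \frac{l}{2} v \tag{21}$$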
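
The initial forecast rules and the trend adjustment can be sketched together as follows. This is a minimal sketch under stated assumptions: states are 0-based indices, "state $A_i$ communicates with $A_i$" is interpreted as a positive self-transition probability, and the function names, the matrix, and the middle values are hypothetical.

```python
def initial_forecast(prev_value, prev_state, P, mid):
    """Initial forecast F(t) from Rules 1 to 3.

    prev_value : observation Y(t-1)
    prev_state : 0-based index of the state of Y(t-1)
    P          : transition probability matrix (list of rows)
    mid        : middle values of the intervals
    """
    row = P[prev_state]
    targets = [j for j, p in enumerate(row) if p > 0]
    if not targets:          # Rule 3: empty FLRG
        return mid[prev_state]
    if len(targets) == 1:    # Rule 1: one-to-one FLRG
        return mid[targets[0]]
    # Rule 2: one-to-many FLRG; Y(t-1) replaces the own middle value.
    return sum(
        (prev_value if j == prev_state else mid[j]) * p
        for j, p in enumerate(row)
    )


def trend_adjustment(prev_state, next_state, P, length):
    """Signed sum of the adjustment values D_t1 and D_t2."""
    step = next_state - prev_state
    adjustment = 0.0
    # Rules (i)/(ii); assumption: "communicates with itself" means
    # that the self-transition probability is positive.
    if step != 0 and P[prev_state][prev_state] > 0:
        adjustment += length / 2 if step > 0 else -length / 2
    # Rules (iii)/(iv): half an interval length per jumped state.
    adjustment += (length / 2) * step
    return adjustment


# Hypothetical three-state example, not the ISSI data.
P = [[0.5, 0.5, 0.0], [0.0, 0.0, 1.0], [0.0, 0.0, 0.0]]
mid = [10.5, 11.5, 12.5]
f_initial = initial_forecast(10.2, 0, P, mid)
f_final = f_initial + trend_adjustment(0, 1, P, 1.0)
```

In this example the initial forecast is 10.2 × 0.5 + 11.5 × 0.5 = 10.85, and because state 0 has a positive self-transition probability and moves up by one state, the adjustment is l/2 + l/2 = 1, giving a final forecast of 11.85.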

4. Results and Discussion

The average-based fuzzy time series Markov chain based on frequency density partitioning was tested on forecasting the Indonesian Sharia Stock Price Index (ISSI) to see whether it could improve the intervals underlying the FLRs and so produce more accurate forecasts.

In forecasting with the average-based fuzzy time series Markov chain based on frequency density partitioning, the first step is to collect the ISSI historical data, which comprise 104 observations. The data are used to determine the universe of discourse, which is then divided into intervals using the average-based method as follows:
(a) Determine the smallest value ($D_{\min}$) and the greatest value ($D_{\max}$) of the data and the values of $D_1$ and $D_2$, so that the universe of discourse $U = [D_{\min} - D_1, D_{\max} + D_2]$ can be defined
(b) Calculate the absolute difference between consecutive data $D_{t+1}$ and $D_t$ using the average-based formula of Section 2.2, starting with the first pair of observations

The same calculation is applied to the second pair of data and so on; the total of the absolute differences is 304.62. The average of these differences is then calculated using the formula in Section 2.2, giving an average absolute difference of 2.96.
(c) The result is then divided by 2 to get 1.48
(d) The value of 1.48 is then looked up in Table 1, which gives a basis of 1 for the interval length
(e) $U$ is then partitioned into intervals of equal length
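
The arithmetic behind steps (b) and (c) can be written out explicitly. The denominator $n - 1 = 103$ is an assumption; it follows from the 104 observations and reproduces the reported average of 2.96.

```latex
\text{mean} = \frac{304.62}{104 - 1} = \frac{304.62}{103} \approx 2.96,
\qquad
l = \frac{2.96}{2} = 1.48
```

With a basis of 1, rounding 1.48 would give an interval length of 1; this final value is an inference from the basis reported in step (d), not a figure stated in the text.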

The next step is to distribute the data over the intervals and determine the frequency density. The interval with the highest frequency, containing 7 observations, is divided into 4 subintervals; the interval with the second highest frequency is divided into 3 subintervals; the intervals with the third highest frequency are each divided into 2 subintervals; and intervals that contain no data are eliminated. Next, the middle value ($m_i$) of each interval is determined; the middle values are listed in Table 2.

Table 2 shows the middle value of each interval after repartitioning based on frequency density. These middle values are used to calculate the initial forecast values of the fuzzy time series. The middle value $m_i$ of an interval is the average of its lower and upper bounds.

Next, the fuzzy sets are defined. From the universe of discourse, 53 fuzzy sets, $A_1, A_2, \ldots, A_{53}$, are formed according to the fuzzy set definition in Section 3.2.

The next step is to perform fuzzification; the data from the fuzzification results are presented in Table 3.

Table 3 shows the fuzzification results for the ISSI weekly data; fuzzification converts crisp values into fuzzy values. For example, the observation for June 9, 2019, is 182.76, which falls in one interval $u_i$ of the partition. The observation has a membership degree of 1 in the corresponding fuzzy set $A_i$, so the data for June 9, 2019, are fuzzified as $A_i$.
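
Fuzzification as described above amounts to finding the interval that contains each observation. The sketch below illustrates this; the interval bounds are hypothetical, because the paper's actual intervals are not reproduced in the text, and only the value 182.76 comes from the example above.

```python
def fuzzify(value, intervals):
    """Return the 0-based index of the interval containing value.

    intervals : sorted list of (lower, upper) pairs; an observation in
                the interval with index i is labelled with fuzzy set A_(i+1).
    """
    for i, (lo, hi) in enumerate(intervals):
        last = i == len(intervals) - 1
        # Intervals are half-open; only the last one includes its upper bound.
        if lo <= value < hi or (last and value == hi):
            return i
    raise ValueError(f"{value} lies outside the universe of discourse")


# Hypothetical subintervals (not the paper's actual partition).
intervals = [
    (182.0, 182.25),
    (182.25, 182.5),
    (182.5, 182.75),
    (182.75, 183.0),
]
index = fuzzify(182.76, intervals)
label = f"A{index + 1}"
```

With these hypothetical bounds, 182.76 falls in the fourth subinterval and is labelled `A4`.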

After fuzzification, the next step is to determine the FLRs and FLRGs, which are presented in Tables 4 and 5.

Table 4 lists the FLRs formed from consecutive observations: if the first observation is fuzzified as $A_i$ and the second as $A_j$, the FLR is $A_i \to A_j$. The FLRs play an important role because they are used to determine the forecast values.

In Table 5, all FLRs formed in Table 4 are grouped into FLRGs by their current state. For example, a current state $A_i$ that has relationships to $A_j$ and $A_k$ gives the two FLRs $A_i \to A_j$ and $A_i \to A_k$, which are grouped into one FLRG, namely, $A_i \to A_j, A_k$.
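
The construction of FLRs and FLRGs from a fuzzified series can be sketched as follows. The function names and the label sequence are hypothetical; the groups keep transition counts, which is an assumption that suits the Markov chain variant because the counts are later turned into probabilities.

```python
from collections import Counter


def build_flr(labels):
    """Pair each fuzzified observation with its successor."""
    return list(zip(labels, labels[1:]))


def build_flrg(flrs):
    """Group FLRs by current state, counting each next state."""
    groups = {}
    for current, nxt in flrs:
        groups.setdefault(current, Counter())[nxt] += 1
    return groups


# Hypothetical fuzzified series, not the ISSI data.
labels = ["A1", "A2", "A1", "A2", "A1", "A3"]
flrs = build_flr(labels)
flrg = build_flrg(flrs)
```

Here `A1` is followed by `A2` twice and by `A3` once, so its group is `A1 -> A2 (2), A3 (1)`. `A3` appears only as the last observation and therefore has an empty group, which is the case covered by Rule 3.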

The next step is to calculate the initial forecast; the FLRGs are used to form a Markov chain and its transition probability matrix. In this study, a transition probability matrix of order $53 \times 53$ was formed, where each element is a probability value obtained from equation (7). For example, if state $A_i$ makes 3 transitions, namely, 2 to state $A_j$ and 1 to state $A_k$, then $P_{ij} = 2/3$ and $P_{ik} = 1/3$. The same method is used to determine every element of the probability matrix.

After the elements of the probability matrix are obtained, the next step is to calculate the initial forecast values using formulas (14), (15), and (16). For example, the forecast for June 16, 2019, is based on the previous week’s data, June 9, 2019: the state transition of the June 9 observation determines which rule applies, and the forecast is calculated accordingly. The same steps are used for the remaining periods. The summary of the initial forecasting results is shown in Table 6.

Table 6 shows the initial forecasting values for the period 16 June 2019 to 30 May 2021. These initial forecasting values were obtained by defuzzifying the FLRGs.

The next step is to adjust the forecasting trend. For example, for June 16, 2019, the current and next states call for rule (iii) of the forecast adjustment rules, so the adjustment value is calculated with equation (19). The other adjustment values are calculated with equations (17), (18), (19), and (20).

After the adjustment values are obtained, the final forecast values are calculated following the rules in equation (21). Calculating all adjusted forecast values in the same way gives the summary of the final forecasting results shown in Table 7.

Table 7 shows the final forecasts after adjustment. Each final forecast value is the sum of the initial forecast value and the adjustment value. The adjusted forecasts are closer to the actual data than the initial forecasts. The comparison of the final forecast values with the actual data can be seen in Figure 2.

Figure 2 compares the actual data with the forecasts of the average-based fuzzy time series Markov chain based on frequency density partitioning. The blue graph shows the actual data, and the orange graph shows the forecasts. Although the forecast values are not identical to the actual data, their pattern closely follows the pattern of the actual data.

The last step is to calculate the forecast accuracy using MSE and MAPE, computed with formulas (9) and (10), respectively. For the average-based fuzzy time series Markov chain based on FDP, the MSE is 5.76 and the MAPE is 1.04%. This indicates good forecasting performance, as also shown in Figure 2, where the forecast values are close to the actual values.

The good accuracy of the fuzzy time series Markov chain is obtained through frequency density partitioning, which produces subintervals that bring the fuzzy values close to the crisp values. It is also supported by the use of the average-based method to determine the interval length.

5. Conclusions

Forecasting with the average-based fuzzy time series Markov chain based on frequency density partitioning (FDP) achieves good accuracy, as shown by the MSE and MAPE values of 5.76 and 1.04%, respectively. The accuracy is good because the method uses the average-based approach to determine the interval length, so that the interval length does not rest only on the researcher’s judgment. The intervals are then partitioned based on frequency density to obtain a better interval length. The interval length in the FTS-Markov chain plays an important role in forming the fuzzy logical relationships (FLRs), which are used to determine the forecast values.

Data Availability

The table data used to support the findings of this study are included within the supplementary information file and also within the article and are also available online at https://g.co/finance/ISSI:IDX?window=5Y.

Conflicts of Interest

The authors declare that there is no conflict of interest regarding the publication of this paper.

Acknowledgments

This research was funded by sources other than the state budget (Selain APBN) of Diponegoro University in 2022 (grant number 215/UN7.A/HK/VII/2022 and SPK Number 569-113/UN7.D2/PP/VII/2022). We would like to thank the chancellor of Diponegoro University and all parties involved in this paper.