Abstract

The timely and accurate forecasting of urban road traffic is crucial for smart city traffic management and control. It can assist both drivers and traffic controllers in selecting efficient routes and diverting traffic to less congested roads. However, estimating traffic volume while taking into account external factors such as weather and accidents is still a challenge. In this research, we propose a hybrid deep learning framework, double attention graph neural network BiLSTM (DAGNBL), that utilizes a graph neural network to represent spatial characteristics and bidirectional LSTM units to capture temporal dependencies between features. Attention modules are added to the GNN and BLSTM to find high-impact attention weight values for the chosen road section. Our model offers the best prediction accuracy with a mean absolute percentage error of 5.21% and a root mean squared error of 4. It can be utilized as a useful tool for predicting traffic flow on certain stretches of road.

1. Introduction

Traffic flow forecasting is a crucial task for transportation management and decision-making. Accurate traffic flow predictions can help to improve traffic control, optimize transportation infrastructure, and reduce travel time and fuel consumption [1, 2]. However, traditional methods such as time series analysis and regression models often fail to capture the complex spatiotemporal patterns and correlations with weather data that can significantly impact traffic flow [3].

In recent years, there has been a growing interest in using deep learning models for traffic flow forecasting. Bidirectional long short-term memory (BiLSTM) networks have been shown to be effective in capturing the temporal dependencies of traffic flow [4]. Additionally, graph neural networks (GNNs) have been proposed to model the spatial dependencies and interactions between traffic nodes [5]. However, these models are often limited to using historical traffic flow data as input and do not consider the impact of weather conditions on traffic flow.

In this study, we propose a novel approach for traffic flow forecasting that combines the power of BiLSTM and GNN models with correlated weather data. Our approach leverages the ability of BiLSTM to capture the temporal dependencies of traffic flow and the ability of GNN to model the spatial dependencies and interactions between traffic nodes. We also incorporate weather data as an additional input to capture the impact of weather conditions on traffic flow. Through extensive experiments on real-world traffic flow data, we demonstrate the effectiveness and superiority of our proposed model in comparison to state-of-the-art methods.

The main goal of the research is to develop a spatiotemporal traffic flow forecasting model that captures the complex interactions between traffic flow and weather data and to improve the accuracy of traffic flow predictions. The main contributions of this research are highlighted as follows:
(1) We propose a novel hybrid model, named DAGNBL, that combines graph neural networks and bidirectional long short-term memory networks to forecast traffic flow in a particular area.
(2) We take into account the impact of nonrecurrent events such as weather changes on traffic flow by incorporating local meteorological information into the model, which leads to a better understanding of spatiotemporal deviations of traffic patterns.
(3) We implement a double attention approach to enhance the performance of our model and enable it to learn the dynamic spatial-temporal correlations of traffic data. Specifically, spatial attention is used to model the complex spatial relationships between different locations, and temporal attention is employed to capture the dynamic temporal links between different times.

The rest of this paper is organised as follows. Section 2 reviews the state-of-the-art methodologies for traffic flow prediction. Section 3 describes the baseline models and methods, along with an in-depth description of the datasets. Section 4 presents the experimental setup and our proposed model. Section 5 presents the findings of the experiments. Section 6 concludes the study and suggests areas for future research.

2. Literature Review

Over the past few years, many researchers, including the authors of [6–13], have focused on the issue of traffic flow forecasting, driven mostly by the advantages it can offer in real-time traffic monitoring. However, because most of this research extrapolates the existing situation into the future, the resulting projections are frequently not very accurate. Traffic patterns are influenced by many factors, such as construction and maintenance of road or roadside infrastructure, population or employment changes, holidays and other special events, weather, and even accidents caused by human error. In light of these difficulties, we investigate urban traffic flow forecasting using one of these external factors, namely weather conditions. It is also encouraging that, during the past 10 years, IoT has promised to increase the knowledge and productivity of transportation organisations. Sensors and other IoT-enabled equipment are able to gather and communicate data about activity occurring in the road network in real time. Transportation management can then examine the data obtained from these devices to manage the flow of traffic on the roads. The data range from vehicle detection, vehicle volume, pressure, and speed measurements to road surface conditions and road weather conditions [14, 15]. The road traffic data needed for an intelligent transport system can be gathered from any of the sources shown in Figure 1.

The problem of predicting traffic levels is commonly divided into long-term forecasting, as addressed in [16–20], and short-term forecasting, as explained by the authors of [21–24]. Short-term forecasts usually incorporate only a few months’ worth of data from a small number of sensors, and they frequently focus on the near future, for example, predicting 10–15 minutes ahead.
Long-term forecasting, on the other hand, necessitates data from numerous sensors gathered over a comparatively longer period, such as an hour or a day. This assists stakeholders in making long-term decisions, such as allowing passengers to schedule their travel according to peak traffic hours and the government to plan the construction of a flyover or overhead bridge in response to a route that consistently experiences traffic congestion. Time-series modelling is frequently used to tackle both types of forecasts and to examine the difficulties in projecting traffic flow at a specific site. Usually, one attempts to forecast the value of a variable using a series of historical samples obtained at predetermined intervals. Predicting traffic, however, is an extremely difficult task. Traffic volume and flow do not depend only on the driver or the vehicle; a number of outside variables, such as on-road incidents or changes in the environment and weather, also have a significant impact on how traffic patterns change on the roads. When such external factors are present, the simple process of time series forecasting becomes a multivariate time series forecasting task. This is a difficult problem because one must simultaneously take into account intraseries temporal correlations and interseries correlations. However, these long- and short-term temporal patterns can now be analysed, learned, and predicted more accurately, thanks to the development of machine learning and deep learning algorithms powered by big data from IoT devices. These algorithms have been useful in many fields, including forecasting of traffic [25–27], energy use [28], stock market analysis [29], pandemic outbreaks [30], sales analysis [31], and price prediction [32]. Thus, for the problem of forecasting traffic flow, if only a single site’s traffic volume, occupancy, or flow is taken into account, the problem is classified as univariate time series forecasting. In a larger sense, if data from several locations are used to analyse traffic flow and its associations, the problem is a standard multivariate time series forecasting one.
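The distinction between univariate and multivariate forecasting can be illustrated with a small sliding-window sketch. The helper name `make_windows`, the window length, and the sample values are assumptions made for illustration only; they are not taken from the datasets used in this study.

```python
def make_windows(series, n_lags, horizon=1):
    """Turn a sequence into (history, target) pairs for supervised forecasting."""
    pairs = []
    for start in range(len(series) - n_lags - horizon + 1):
        history = series[start:start + n_lags]
        target = series[start + n_lags + horizon - 1]
        pairs.append((history, target))
    return pairs


# Univariate case: vehicle counts observed at a single site (made-up values).
counts = [12, 15, 11, 18, 20, 17]
uni = make_windows(counts, n_lags=3)

# Multivariate case: every time step holds the readings of two sites.
multi_site = [(12, 7), (15, 9), (11, 8), (18, 12), (20, 14), (17, 11)]
multi = make_windows(multi_site, n_lags=3)

print(uni[0])    # three past counts and the next count
print(multi[0])  # three past reading pairs and the next pair
```

In the multivariate case each history entry carries several series at once, which is what lets a model exploit interseries correlations in addition to the temporal ones.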

Figure 2 depicts the typical sequence of steps required to use a deep learning pipeline to forecast the predicted traffic value. The diagram demonstrates how data from various IoT sensors are first gathered, followed by the extraction and learning of spatiotemporal correlations.

It is clear from the abovementioned explanation that predicting traffic in isolation without using neighbourhood traffic patterns or external factors, particularly weather conditions, is not very effective. The traffic volume data gathered from all nearby sensors should be collected along with the external weather conditions since they have a significant impact on traffic flow for real-time and accurate traffic flow forecasting of a specific area. Traffic flow prediction research falls under the categories of parametric, machine learning, deep learning, and hybrid techniques as shown in Figure 3.

In this section, we will go over the research that has already been performed employing techniques for predicting both typical traffic patterns and traffic under difficult conditions.

2.1. Forecasting Traffic under Regular Road Conditions

To forecast traffic flow, Xia et al. in [33] suggested a bidirectional LSTM network with attention and a normal distribution module. The attention mechanism, which employs a five-second time window for the road segment, identifies the high-impact attention weight values that affect the targeted road segment, while the normal distribution is used to identify the influence of spatial correlation. In another study [34], Wei and Sheng proposed a hybrid model consisting of a graph attention network and an LSTM network. They used a dynamic adjacency matrix to depict the spatial dependencies of the topological road network, and the LSTM network was used to extract dynamic temporal features. Guo et al. in [35] used a fusion of spatial and temporal attention modules to forecast traffic flow, with graph convolutions for spatial dependencies and standard convolutions for temporal dependencies. However, their work does not consider external influencing factors such as accidents or changing weather conditions. Li et al. in [36] took a similar approach in which temporal and spatial attention modules are combined with a dynamic graph convolutional neural network layer applied to the data; their approach incorporates a multisensor data connection convolution block with a benchmark adaptive mechanism correlation. Lu et al. in [37] suggested an LSTM equipped with temporally aware convolutional context (TCC) and loss-switch mechanism (LSM) blocks. To suppress data outliers, Chen et al. in [38] used a variety of denoising techniques, such as empirical mode decomposition, ensemble empirical mode decomposition, and wavelets. The denoised data then serve as training data for the LSTM model that makes the predictions. In other works, Ali et al. used support vector regression [39] and graph convolution networks with dynamic hash tables [40] to forecast traffic flows. Qiao et al. in [41] used a 1D CNN with LSTM to predict urban traffic flows. In [42], Bohan proposed a bidirectional recurrent neural network that predicts traffic states by utilizing both historical and future data in training the network.

2.2. Forecasting Traffic Correlated with Weather Information

In a simple work performed by Jia et al. [43], an image matrix is first constructed using the urban traffic inflow and outflow data. Then, a self-attention module is used to discover the internal relationships between pixels and record the internal organisation of the image. Finally, a deep Res2Net module is employed to forecast how many people will go through each area of the city based on previous trajectories and vacations. Zhang et al. in [44] and Ye et al. in [45, 46] both used graph convolutional neural networks with an attention mechanism and considered external factors while making predictions. Cui et al. in [47] used a stacked approach where the first layer is a BLSTM and the last layer is an LSTM. Their model has a sandwiched layer of either LSTM or BLSTM between the first and the last layers for capturing the spatiotemporal dependencies. A very similar work is proposed by Ma et al. in [48].

In conclusion, numerous studies using LSTM models and graph neural networks for traffic prediction have been reported in the literature. However, the potential of BiLSTM models, which exploit temporal dependencies in both directions of the historical sequence, for traffic time series prediction has received very little attention. Furthermore, little work has been done on systems that combine the strengths of graph neural networks with bidirectional LSTM networks. Table 1 summarizes the main deep-learning-based research conducted in spatiotemporal traffic flow forecasting.

3. Materials and Methods

This section describes in depth the baseline architectures used in our model and the datasets used in this study.

3.1. Deep Bidirectional Long Short-Term Memory Network

The deep bidirectional LSTM network, first proposed in [58], is an extension of the plain LSTM network. Unlike its predecessor, it operates with two LSTM cells at each time step: a forward LSTM cell, which reads the sequence from past to future, and a reverse LSTM cell, which reads it from future to past. These should not be confused with the neural network’s forward pass and backward pass. The forward and reverse cells receive only the inputs, and the output is collected by passing their results through the sigmoid activation function. This allows the network to preserve the long-term dependencies between the data features. The overall structure of the bidirectional LSTM cell is depicted in Figure 4.
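The two-direction idea can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the implementation used in this study: the weights are random and untrained, the helper names (`init_lstm`, `lstm_pass`, `bilstm`) and the dimensions are hypothetical, and the two directions are combined by concatenation, which is one common choice rather than the sigmoid output step described in the text.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def init_lstm(input_dim, hidden_dim, rng):
    """Random weights for one direction; the four gates are stacked row-wise."""
    return {
        "W": rng.normal(0.0, 0.1, (4 * hidden_dim, input_dim)),
        "U": rng.normal(0.0, 0.1, (4 * hidden_dim, hidden_dim)),
        "b": np.zeros(4 * hidden_dim),
    }


def lstm_pass(xs, params):
    """Run an LSTM over xs of shape (T, input_dim); return states (T, hidden_dim)."""
    hidden_dim = params["U"].shape[1]
    h = np.zeros(hidden_dim)
    c = np.zeros(hidden_dim)
    outputs = []
    for x in xs:
        z = params["W"] @ x + params["U"] @ h + params["b"]
        i, f, o, g = np.split(z, 4)  # input, forget, output gates and candidate
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
        outputs.append(h)
    return np.stack(outputs)


def bilstm(xs, fwd_params, bwd_params):
    """Concatenate a forward pass with a time-reversed pass over the same inputs."""
    h_fwd = lstm_pass(xs, fwd_params)
    h_bwd = lstm_pass(xs[::-1], bwd_params)[::-1]  # re-align to original time order
    return np.concatenate([h_fwd, h_bwd], axis=1)


rng = np.random.default_rng(0)
xs = rng.normal(size=(6, 3))  # 6 time steps, 3 features (illustrative sizes)
fwd = init_lstm(3, 4, rng)
bwd = init_lstm(3, 4, rng)
out = bilstm(xs, fwd, bwd)
print(out.shape)  # (6, 8)
```

At every time step the first half of the output summarises the inputs seen so far, while the second half summarises the inputs that follow, so each step has access to context from both sides.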

3.2. Graph Neural Network

A graph neural network is based on the graph data structure, which consists of a group of nodes and edges represented in Euclidean space. Nodes usually represent the feature vectors, and edges maintain the relationships between adjacent nodes. A GNN usually takes node attributes and finds an embedding for each node. The idea of the GNN was proposed in [59–61]. A graph is usually represented as a set of nodes and edges:

$$G = (V, E),$$

where $V$ represents the set of nodes and $E$ represents the set of edges existing between any two nodes ($i$ and $j$) in the graph.

A graph adjacency matrix has one row and one column per vertex; each entry is 1 if a connection exists between the two corresponding nodes and 0 otherwise.
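As a small illustration, the sketch below builds such a matrix for an undirected graph. The four-node chain and the function name `adjacency_matrix` are hypothetical and serve only as an example.

```python
def adjacency_matrix(n_nodes, edges):
    """Build a symmetric 0/1 adjacency matrix from undirected edge pairs."""
    A = [[0] * n_nodes for _ in range(n_nodes)]
    for i, j in edges:
        A[i][j] = 1
        A[j][i] = 1  # undirected: the connection holds in both directions
    return A


edges = [(0, 1), (1, 2), (2, 3)]  # hypothetical chain of four sensors
A = adjacency_matrix(4, edges)
for row in A:
    print(row)
```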

The objective of the graph neural network is to encode contextual graph information by combining the data from nearby nodes. At every iteration, each node receives information from its neighbouring nodes. This information is then merged with the node’s already-existing features to form an updated representation of the node.
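One round of this neighbourhood aggregation can be sketched as follows. Mean aggregation with self-loops is an assumption chosen for simplicity; a trained GNN layer would additionally apply a learned weight matrix and a nonlinearity, which are omitted here. The function name and the toy values are illustrative.

```python
import numpy as np


def message_passing_step(A, H):
    """One round of mean aggregation over each node and its neighbours."""
    A_hat = A + np.eye(A.shape[0])          # self-loops keep the node's own features
    deg = A_hat.sum(axis=1, keepdims=True)  # number of contributors per node
    return (A_hat @ H) / deg


# Three nodes in a chain (0 - 1 - 2), each with a single feature.
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
H = np.array([[1.0], [2.0], [6.0]])

H1 = message_passing_step(A, H)
print(H1.ravel())  # node 0 averages {1, 2}, node 1 {1, 2, 6}, node 2 {2, 6}
```

Repeating the step lets information travel further: after two rounds, node 0 is influenced by node 2 even though the two are not directly connected.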

3.3. Spatiotemporal Graph Neural Network

A GNN whose graph changes over time is called a temporal graph neural network, and its graph is usually represented as

$$G(t) = \left(V, E, X_v(t), X_e(t)\right),$$

where $X_v(t)$ and $X_e(t)$ represent the dynamic features of the nodes and edges, respectively. For time series forecasting problems, it is important to combine a graph neural network with an RNN, LSTM, or GRU, which lets the overall network capture the spatial and temporal features together. Figure 5 illustrates the connection between the spatial and temporal features.

3.4. Attention Mechanism

One of the most important developments in deep learning in recent years is the attention mechanism. It has been widely applied in many domains with many deep learning models, such as convolutional neural networks [62–67], recurrent neural networks [65, 68–70], long short-term memory networks [62, 71–74], autoencoders [75–78], generative adversarial networks [79–82], and variational autoencoders [78, 83, 84]. The use of the attention mechanism in traffic flow prediction is studied by the authors of [85]. Their studies clearly indicate that adding an attention mechanism helps in selecting only the most relevant features from the spatial and temporal feature vectors for forecasting.
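The core computation behind attention can be sketched as a softmax over relevance scores followed by a weighted sum. The scaled dot-product scoring used below is an assumption, since the text does not specify a scoring function, and the vectors are toy values.

```python
import numpy as np


def softmax(scores):
    e = np.exp(scores - scores.max())  # subtract the max for numerical stability
    return e / e.sum()


def attend(query, keys, values):
    """Scaled dot-product attention for a single query vector."""
    scores = keys @ query / np.sqrt(query.shape[0])
    weights = softmax(scores)
    return weights, weights @ values


query = np.array([1.0, 0.0])
keys = np.array([[1.0, 0.0],    # aligned with the query
                 [0.0, 1.0],    # orthogonal to it
                 [-1.0, 0.0]])  # opposed to it
values = np.array([[10.0], [20.0], [30.0]])

weights, context = attend(query, keys, values)
print(weights)  # largest weight on the first key, smallest on the last
print(context)  # weighted combination of the value rows
```

The weights always sum to one, so the output is a convex combination of the values in which the entries most similar to the query dominate.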

3.5. Description of Datasets

Intelligent transportation systems may require four main categories of data: emergency information, information about vehicles, information about traffic facilities, and information about traffic flow [86]. CityPulse provides a live stream of IoT data from numerous sensors placed throughout the Danish city of Aarhus. Road traffic, pollution, weather, cultural, social, library event, and parking data are among the datasets that are available [39, 87, 88]. For our study, we use only the road traffic and weather data for the eight-month period from February 2014 to September 2014. Table 2 describes the parameters in the CityPulse road traffic data, which contain data collected from two linked sensors connecting two streets in Aarhus. Figure 6 shows a map of Aarhus indicating two observation points from street Arhusvej 72 to Arhusvej 0. Other figures and tables display additional information and data about these observation points. LSTW is a national dataset that includes information on weather and traffic conditions in the United States, including traffic incidents (e.g., accidents and construction) and weather events (e.g., rain, snow, and storms). As of 2021, it contains approximately 37 million records of weather and traffic-related occurrences since August 2016.

4. Experimental Setup

For the experimental setup, TensorFlow and the necessary packages were installed on Google Colab. The dataset containing both traffic flow and weather data was then uploaded and preprocessed using the Pandas library. The model was built using Keras, a high-level API for TensorFlow, with a combination of BiLSTM and GNN layers. The following hyperparameters were used to train the proposed model: a batch size of 32, L2 regularization of 0.01 for both the time series GNN and LSTM layers, the Adam optimizer, 200 epochs, and a learning rate of 0.01. Additionally, a search space was defined for hyperparameter tuning, which included varying batch sizes, learning rates, regularization strengths, numbers of hidden units in the LSTM and GNN layers, and numbers of attention heads in the model. In our experiments, we used a graph convolutional layer with 64 nodes and a 2-layer BiLSTM with 128 hidden units. We found that these hyperparameters resulted in a good trade-off between model complexity and predictive accuracy. In particular, we used multihead attention in our model, which allows attention to be computed across multiple feature maps and has been shown to be effective in modelling spatial dependencies in graph data. The best-performing hyperparameters were chosen based on validation accuracy, and the chosen values were used for final model training and testing. Furthermore, data normalization was performed on both input features and target variables, which is a common practice for time series data. The dataset was randomly divided into a training set and a testing set with a ratio of 80 : 20. The training set was used to train the model, while the testing set was used to evaluate the performance of the model on unseen data. The model was then used to make predictions on the test data, and the performance was evaluated using metrics such as mean squared error, mean absolute error, and R-squared. This approach can be used to improve the accuracy of traffic flow forecasting by incorporating correlated weather data.
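The reported settings and the 80 : 20 random split can be summarised in a short sketch. The dictionary keys, the function name `random_split`, and the seed are illustrative assumptions; only the numeric values come from the description above.

```python
import random

# Values as reported in the text; the key names are illustrative.
HYPERPARAMS = {
    "batch_size": 32,
    "l2_regularization": 0.01,
    "optimizer": "adam",
    "epochs": 200,
    "learning_rate": 0.01,
    "graph_conv_nodes": 64,
    "bilstm_layers": 2,
    "bilstm_hidden_units": 128,
}


def random_split(samples, train_fraction=0.8, seed=0):
    """Shuffle a copy of the samples and cut it into train and test portions."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # fixed seed makes the split repeatable
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


samples = list(range(100))  # stand-in for the rows of the merged dataset
train, test = random_split(samples)
print(len(train), len(test))  # 80 20
```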

4.1. Proposed Model

Our proposed model aims to accurately forecast traffic flow by utilizing a combination of BiLSTM, GNN, and attention mechanisms, as shown in Figure 7. The model takes into account both the temporal dependencies in the traffic flow and weather data and the spatial relationships between the traffic sensors. The use of attention layers allows the model to weigh the importance of each feature in the input data and improve the accuracy of the final prediction. The main steps are defined as follows:
(1) An attention layer weighs the importance of each feature in the input traffic flow and weather data. This attention layer can be implemented using the attention mechanism from TensorFlow.
(2) A BiLSTM layer processes the sequential traffic flow and weather data. The BiLSTM captures long-term dependencies in the data.
(3) Another attention layer weighs the importance of each feature in the spatial location of the traffic sensors. It can be used to focus on specific parts of the input data and graph structure, allowing the GNN to learn more relevant spatial relationships, for example, by weighing the importance of each sensor in the input data matrix based on factors such as the traffic flow and weather data.
(4) A GNN layer processes the spatial relationships between the traffic sensors.
(5) A fully connected layer takes the output of the GNN as input and produces the final prediction.
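The order of the five steps can be traced with a shape-level NumPy sketch. It is not the model used in this study: all weights are random and untrained, a plain tanh recurrence stands in for the BiLSTM, and every function name, dimension, and the four-sensor graph are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)


def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)


def feature_attention(X, score):
    """Step 1: rescale each input feature by a softmax weight."""
    return X * softmax(score)  # broadcasts over (sensors, time, features)


def bidirectional_rnn(X, fwd, bwd):
    """Step 2 stand-in: tanh recurrence run in both time directions."""
    def run(seq, Wx, Wh):
        h = np.zeros((seq.shape[0], Wh.shape[0]))
        for t in range(seq.shape[1]):
            h = np.tanh(seq[:, t] @ Wx + h @ Wh)
        return h
    return np.concatenate([run(X, *fwd), run(X[:, ::-1], *bwd)], axis=1)


def spatial_attention(H, A):
    """Step 3: attention over each sensor's neighbours (and itself)."""
    scores = H @ H.T / np.sqrt(H.shape[1])
    mask = (A + np.eye(A.shape[0])) > 0
    scores = np.where(mask, scores, -1e9)  # unconnected pairs get ~zero weight
    return softmax(scores, axis=1)


def gnn_layer(H, att, W):
    """Step 4: attention-weighted neighbourhood aggregation with ReLU."""
    return np.maximum(att @ H @ W, 0.0)


N, T, F, hidden = 4, 12, 5, 8  # sensors, time steps, features, hidden size
X = rng.normal(size=(N, T, F))
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)


def rnn_weights():
    return rng.normal(0, 0.3, (F, hidden)), rng.normal(0, 0.3, (hidden, hidden))


Xa = feature_attention(X, rng.normal(size=F))
H = bidirectional_rnn(Xa, rnn_weights(), rnn_weights())
att = spatial_attention(H, A)
Hs = gnn_layer(H, att, rng.normal(0, 0.3, (2 * hidden, hidden)))
y_hat = Hs @ rng.normal(size=hidden)  # Step 5: fully connected output per sensor
print(y_hat.shape)  # (4,)
```

The sketch makes the data flow explicit: the temporal stage reduces each sensor's sequence to one vector, and the spatial stage then mixes those vectors only along the edges of the sensor graph.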

This architecture allows the model to weigh the importance of each feature in the traffic flow and weather data as well as the spatial location of the traffic sensors, which can improve the accuracy of the final prediction. A sample of information stored at nodes and edges can be visualized in Table 3. Table 4 explains the graph represented in Figure 8, where the nodes represent the sensor locations and edges represent the connection between two sensor points.

4.2. Data Preparation for Hybrid Model

The data required for model training were prepared by merging the weather data and the road traffic data. For example, the road traffic data for the months of February to June and August to September were copied into a single .csv file. Table 5 shows the first few entries of the processed dataset for our model. The merged file contained more than 9000 k rows of data for the months of February to September for any two observation points at a particular time. Figures 9(a)–9(c) show the visualisations generated from our processed dataset for the vehicle count from 14 February to 16 February 2014.

Additionally, the flow pattern on weekdays and weekends was compared between various sensors that are situated on various road segments. One example comparison is shown in Figures 10(a)–10(d), which contrasts traffic flow on Sunday with Thursday on four different road segments, including a road connection of two streets of Århusvej (sensor ID: 158324), Nordre Ringgade to Randersvej (sensor ID: 187695), Århusvej Østjyske to Motorvej (sensor ID: 158355), and Edwin RahrsVej to Anelystvej (sensor ID: 197274), where the last road segment connects two cities, i.e., Aarhus and Tilst.

The graph used in this study was constructed using the CityPulse traffic and weather data. Each row of the merged dataset was considered as a node in the graph, and the edges were formed based on the pairwise Euclidean distances between the nodes. The edge weights were calculated using the traffic data (vehicle count and average speed) and traffic metadata (distance in meters and report ID). The node features were obtained from the weather data (humidity, dew point, and wind speed). The process of constructing the graph can be described using the following equations.

4.3. Node Feature Matrix

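Let $X$ be the node feature matrix of size $N \times D$, where $N$ is the number of nodes and $D$ is the number of features. Each row of $X$ corresponds to a node, and each column corresponds to a feature. In this study, $X$ was constructed using the weather data as follows:

$$X = \begin{bmatrix} x_1^{\top} \\ \vdots \\ x_N^{\top} \end{bmatrix}, \qquad x_i = \left[\mathrm{humidity}_i,\ \mathrm{dew\ point}_i,\ \mathrm{wind\ speed}_i\right]^{\top}.$$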

4.4. Pairwise Distance Matrix

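Let $D$ be the pairwise distance matrix of size $N \times N$, where $N$ is the number of nodes. The element $D_{ij}$ represents the Euclidean distance between node $i$ and node $j$. In this study, $D$ was constructed as follows:

$$D_{ij} = \lVert x_i - x_j \rVert_2,$$

where $x_i$ denotes the feature vector in row $i$ of the node feature matrix $X$.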
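The construction of a node feature matrix and its pairwise Euclidean distance matrix can be sketched as follows. The weather readings are hypothetical values chosen so that the distances are easy to verify by hand, and computing the distance on the feature rows is an assumption about how the node distances are defined.

```python
import numpy as np

# Hypothetical weather readings: one row per node; the columns are
# humidity, dew point, and wind speed (illustrative values and units).
X = np.array([
    [80.0, 3.0, 10.0],
    [80.0, 3.0, 13.0],
    [76.0, 0.0, 10.0],
])


def pairwise_distances(X):
    """Euclidean distance between every pair of rows of X."""
    diff = X[:, None, :] - X[None, :, :]  # shape (N, N, D) via broadcasting
    return np.sqrt((diff ** 2).sum(axis=-1))


D = pairwise_distances(X)
print(D[0, 1])  # 3.0: the rows differ only in wind speed, by 3
print(D[0, 2])  # 5.0: differences of 4 and 3 form a 3-4-5 triangle
```

The resulting matrix is symmetric with a zero diagonal, so thresholding it (or keeping the nearest neighbours of each node) yields an undirected edge set.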

4.5. Edge Weight Matrix

Let $W$ be the edge weight matrix of size $N \times N$, where $N$ is the number of nodes. The element $W_{ij}$ represents the weight of the edge between node $i$ and node $j$. In this study, $W$ was constructed using the traffic data and metadata as follows:

$$W_{ij} = f\left(v_{ij}, s_{ij}, d_{ij}, r_{ij}\right),$$

where $v_{ij}$ is the number of vehicles between node $i$ and node $j$, $s_{ij}$ is the average speed between node $i$ and node $j$, $d_{ij}$ is the distance in meters between node $i$ and node $j$, and $r_{ij}$ is the report ID of the traffic data between node $i$ and node $j$. By constructing the graph in this manner, we were able to incorporate both the traffic and weather data into our GNN model, which allowed us to predict traffic flow with high accuracy.

5. Results and Discussion

Two studies, GMAN [90] and STSGCN [91], utilize graph convolutional networks and multihead attention mechanisms to predict traffic flow. GMAN takes traffic sensor data as input and predicts traffic speeds at future time steps, while STSGCN uses spatiotemporal traffic data to make predictions. Both models outperform several baselines on benchmark traffic datasets, demonstrating the effectiveness of graph convolutional networks for traffic forecasting. However, neither GMAN nor STSGCN incorporates weather data, making direct comparisons with our model inappropriate. Nevertheless, we compare our suggested model with established techniques and representative techniques based on BiLSTM and GNN to showcase its efficacy. The following is a brief summary of the baselines.
(i) TFFNet: It creates a cubic spatiotemporal trajectory by dividing and matching the GPS trajectory data from each day’s relevant locations. By integrating the sampling from each cube slice, a path is produced. A spatial grid is made by connecting each of the routes. The graph shows the volume of traffic in each grid cell over a 15-minute period. The model is a deep convolutional neural network based on a residual network architecture and is trained on the Wuhan traffic dataset.
(ii) Dynamic GRCNN: It predicts the movement of people in city traffic. The authors created incidence dynamic graph structures to replicate the traffic linkages from historical passenger movements among stations and used the SubwayBJ, BusBJ, and TaxiBJ datasets for training their model, which is based on LSTM and a graph convolution network.
(iii) trafficBERT: The authors constructed a transformer model by stacking numerous encoder layers in order to preserve the BERT properties. By combining all the data, they enabled the model to capture the full traffic flow. Their model used the METR-LA, PeMS-L, and PeMS-Bay datasets to anticipate traffic volume using a transformer-based BERT algorithm.
(iv) ST-TrafficNet: The authors suggested novel multidiffusion convolution blocks made up of attentive and bidirectional convolution for capturing spatial interactions. High-dimensional temporal data are kept in layered long short-term memory (LSTM) blocks. They employed LSTM with multidiffusion convolution blocks to extract and forecast spatiotemporal characteristics.
(v) ST-GNN: To incorporate information on traffic flows from surrounding roads more efficiently, a GNN layer with a position-specific attention mechanism was used. The authors combine an RNN with a transformer layer in order to capture the local and global temporal dependencies. They used a GCNN with gated recurrent units to extract features and forecast.
(vi) M-B-LSTM: This model was included in our comparison because it uses attention mechanisms, which draw attention to the more crucial information. With a straightforward LSTM guided by attention, the model produced good prediction results.
(vii) DHSTNet and GCN-DHSTNet: This approach took into account spatiotemporal dependencies as well as other external elements such as the state of the roads. The authors divide the temporal features of citywide traffic crowds into three main categories: a recent, a daily, and a weekly component. Their models successfully predicted both the NYC bike data and the Taxi Beijing dataset, using CNN with LSTM (DHSTNet) and CNN and LSTM with GCNN (GCN-DHSTNet).

We evaluated the accuracy of our approach using the root mean square error (RMSE), which is provided in equation (1). To keep the comparison straightforward and fair, we calculated, for each model, the discrepancy between the predicted and actual traffic counts. A lower RMSE value indicates a better prediction. Table 7 contrasts the RMSE values of our model with those of the reference models. Figure 11 shows the ground-truth and predicted daily traffic flows of road segment “158324” (Arhusvej) for one day.

Figure 11 shows the prediction results compared to the ground truth after training for one sensor, 158321, from February to June using the CityPulse dataset. Table 7 reveals that our model with BiLSTM and GNN with attention mechanism had the lowest RMSE, followed by the ST-GNN model and the GCN-DHSTNet model. Therefore, it can be inferred that the addition of the BiLSTM network to extract the temporal dependencies while preserving some external parameters, such as dew point and air pressure, has enhanced the overall prediction performance. One explanation for the difference from the other models could be that our model used data from variable road segments as input to obtain spatial dependencies from the graph neural network, whereas GCN-DHSTNet used a fixed 32 × 32 grid for building the graph representation. Another evaluation metric used was the mean absolute percentage error (MAPE). This metric averages the absolute difference between the ground-truth and forecast values, expressed as a percentage of the ground-truth values. A forecast is deemed to be of acceptable accuracy when the MAPE value is low, usually less than 5%. The calculation of MAPE is shown in equation (6). Table 6 shows the MAPE of our model compared with some of the other baseline models.
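Both metrics follow their standard definitions, RMSE $= \sqrt{\frac{1}{n}\sum_{i}(y_i - \hat{y}_i)^2}$ and MAPE $= \frac{100}{n}\sum_{i}\left|\frac{y_i - \hat{y}_i}{y_i}\right|$, and can be computed as in the sketch below. The sample counts are made up for illustration and are not results from this study.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between two equally long sequences."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def mape(actual, predicted):
    """Mean absolute percentage error; actual values must be non-zero."""
    ratios = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * ratios / len(actual)


actual = [100.0, 200.0, 400.0]     # made-up ground-truth vehicle counts
predicted = [110.0, 190.0, 400.0]  # made-up forecasts

print(round(rmse(actual, predicted), 3))  # 8.165
print(round(mape(actual, predicted), 3))  # 5.0
```

Because MAPE divides by the ground truth, the same absolute error of 10 vehicles counts for 10% on the first sample but only 5% on the second, whereas RMSE treats both errors identically.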

Figure 12 compares the evaluation metric results on both datasets. Min-max normalisation, also known as feature scaling, was used on both datasets to apply a linear transformation to the raw data. With this method, all scaled data lie within the range [0, 1]. As can be seen from the graph, the model performs somewhat better on the CityPulse dataset because it contains linking road IDs as segments, which greatly aided the construction of the graph neural network.
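Min-max normalisation maps each value $x$ to $(x - x_{\min}) / (x_{\max} - x_{\min})$. A minimal sketch follows; the function names and sample counts are illustrative, and the inverse transform is included because predictions made on scaled data usually need to be mapped back to vehicle counts.

```python
def min_max_scale(values):
    """Linearly rescale values so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant input: avoid division by zero
    return [(v - lo) / (hi - lo) for v in values]


def inverse_scale(scaled, lo, hi):
    """Map scaled values back to the original range."""
    return [s * (hi - lo) + lo for s in scaled]


counts = [10, 20, 30]  # made-up vehicle counts
scaled = min_max_scale(counts)
print(scaled)                         # [0.0, 0.5, 1.0]
print(inverse_scale(scaled, 10, 30))  # [10.0, 20.0, 30.0]
```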

6. Conclusion

In this article, we have put forth a hybrid model for spatial-temporal traffic flow forecasting on city roads. The suggested model incorporates a graph neural network with attention mechanisms for extracting spatial characteristics from various road segments while also paying attention to environmental variables such as dew point, air pressure, and wind speed. A BiLSTM network with an attention mechanism performs the prediction while maintaining the temporal dependencies, and the model was evaluated on the CityPulse and LSTW road traffic and weather datasets. The suggested approach is better suited to predicting the monthly traffic patterns in transportation hubs along significant road segments. Results show that our model has an MSE value of 6.309, an MAE value of 2.256, and an RMSE value of 2.511. Dew point, humidity, and wind speed are the only three weather factors the model currently takes into account, although the dataset also contains numerous additional meteorological variables, such as temperature, pressure, and wind direction. Another limitation of our study is that we do not account for trajectory features during training because of the resulting increase in model complexity. In future work, we plan to conduct sensitivity and scalability experiments to explore the optimal values of input parameters and investigate the performance of the proposed method as the sizes of the training and test sets change. We would also like to broaden the scope of our work to incorporate other factors that affect how much traffic is present on the roads, such as festivals and accidents. Finally, we intend to study the problem using ensemble forecasting.

Data Availability

The data supporting the current study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

The authors would like to thank the Deanship of Scientific Research at Majmaah University for supporting this work under Project number R-2023-518.