Abstract

This paper proposes a method for estimating the speed and position of unsampled vehicles using sampled data from connected automated vehicles (CAVs). Determining vehicle speed and position on the road is a challenging and crucial task, because these quantities reflect traffic flow characteristics and contribute to traffic state estimation and intersection signal timing optimization. CAVs can upload their own trajectory data while also capturing trajectory data of surrounding vehicles through onboard sensors. First, using real vehicle trajectory data, the correlation between the velocity of following vehicles and the velocity of leading vehicles under different densities is analyzed, leading to a velocity estimation model that incorporates a speed correction factor. Second, the correlation between time headway, the rate of change of following vehicle acceleration, and traffic density is examined. To account for heterogeneous car-following behavior when using the Intelligent Driver Model (IDM), a real-time optimization model that estimates vehicle position by optimizing IDM parameters is proposed. The velocity estimation model and the position estimation model are formulated as two nonlinear optimization problems. Finally, the proposed method is validated using actual vehicle trajectory data. Experimental results demonstrate that, with two CAVs, the proposed method reduces the average absolute error by 30.73% and the standard deviation of the average absolute error by 42.8% compared to a linear model-based speed estimation method under different density conditions.
Compared to a method that estimates vehicle position by calibrating desired gaps, the proposed method reduces the average absolute error by 38.2% and the standard deviation of the average absolute error by 41.7% under different density conditions. Furthermore, the proposed method exhibits good practicality under different CAV penetration rates.

1. Introduction

Vehicle trajectory information accurately reflects the spatiotemporal evolution characteristics of traffic flow and has been widely used by many scholars to estimate queue length [1, 2], travel time [3], traffic delay [4], and traffic volume [5]. It also helps optimize signal timing at intersections and evaluate control strategies [6], providing a solid foundation for intelligent management and control of traffic systems.

Currently, there are two main ways to obtain vehicle trajectory data: fixed position sensors and mobile sensors. Fixed sensors include video cameras, loop detectors, and microwave radar detectors. They can only collect limited data, such as traffic flow and vehicle types, at fixed sections and fixed times. Their detection range is small and their missed detection rate is high, making it impossible to record detailed vehicle spatiotemporal information [7]. Moreover, because of their high installation and maintenance costs and the large extent of urban road networks, it is difficult for fixed detectors to cover the entire urban road network [8].

The development of connected and automated vehicle (CAV) technologies has created the possibility of solving the above problems. CAVs can provide safer, more environmentally friendly, more energy-efficient, and more convenient travel methods and comprehensive solutions. They are believed to have the potential to improve traffic efficiency, reduce energy consumption, and reduce the occurrence of traffic accidents. CAVs are vehicles that achieve assisted or even autonomous driving without human intervention by using advanced sensing technology [9]. Equipped with advanced sensors and various wireless technologies, a CAV can not only share its own trajectory data with other CAVs and upload it to a Roadside Unit (RSU) but also acquire trajectory data from human-driven vehicles (HVs) without sensors within its detection range. Therefore, CAVs can be considered a new type of mobile sensor and can be used to collect spatiotemporal information of other vehicles, thus providing comprehensive vehicle trajectory data [10]. However, CAVs are not expected to reach a high penetration rate on everyday roads in the next few decades. Even though CAVs can provide complete trajectory data for vehicles within their sampling range, a low penetration rate leaves the collected spatiotemporal data sparse.

Wireless transmission of data for traffic management and control has been a hot research topic in recent years. Most of the work has focused on using wirelessly transmitted data for traffic state estimation and single-vehicle position estimation, rather than position estimation of unsampled vehicles in a queue. Bar-Gera et al. [10] estimated vehicle speed and travel time based on triangulation from signal towers. Studies [11–13] proposed linear state estimation algorithms based on Kalman filtering to estimate the position of a single vehicle using GPS location data. Studies [14, 15] used particle filtering algorithms to estimate vehicle positions. Liu et al. [16] improved the accuracy of single-vehicle position estimation by using high-resolution data from 5G base stations. Although the above methods can estimate the position of a single vehicle well, weak signals in remote mountainous areas, tunnels, and overlapping overpasses can easily cause GPS and 5G signal loss, making it impossible to estimate the position of a single vehicle from wirelessly transmitted data. Qin et al. [17] used passive RFID tags deployed on the road to perform real-time position estimation of a single vehicle, but this method relies too heavily on special road infrastructure and communication equipment. Therefore, determining the state of unsampled vehicles based on partially sparse sampled data is the key to solving the above problem.

In recent years, with the development of CAV technologies, it has become a trend to estimate the state of unsampled vehicles based on partially sampled data. Numerous studies, such as literature [18], estimate the travel time of this type of vehicle based on partial CAVs passing through the intersection, assessing traffic delays. Feng et al. [19] reconstructed vehicle trajectories in the road network using Particle Filter theory and five correction factors, including path consistency, travel time consistency, flow model, etc., based on data from Automatic Vehicle Identification (AVI) and traditional detectors. Chen et al. [20] proposed a method for reconstructing highway vehicle trajectories based on CAV sampling data using the Intelligent Driver Model (IDM) and introducing expected headway calibration factors. Some scholars also use a data-driven approach to complete unsampled data by analyzing the correlation among sampled data. Lint et al. [21] estimated vehicle speed using a State-Related Filter. Ji [22] used Long Short-Term Memory (LSTM) to train on NGSIM data and estimated the speed and position of vehicles closely following the front vehicle on highways. Nanthawichit et al. [23] estimated the travel time of vehicles through the fusion of data from mobile sensors and fixed-point detectors. However, when the sampled data are sparse, data-driven methods (such as LSTM) cannot obtain sufficient data for training, which can lead to anomalous driving behavior in estimating the state of unsampled vehicles, such as negative estimated speed or overlapping vehicle positions.

To address the effect of sparse sampling data on the estimation accuracy of data-driven methods, a car-following model can be used to estimate vehicle positions, because it fully considers the interdependence between the leading and following vehicles [24]. Goodall et al. [25] utilized GPS data from highway vehicles and developed a car-following position estimation algorithm based on the Wiedemann model with preset parameters. By comparing the actual acceleration of vehicles with the expected acceleration calculated from the Wiedemann model, it was determined that there were undetected human-driven vehicles between two CAVs when the difference in acceleration exceeded a certain threshold [26]. Yao et al. [27] employed the IDM to optimize the insertion position of human-driven vehicles, with the primary objective of minimizing the mean squared error between the actual acceleration of the leading vehicle and the estimated acceleration. This method, based on car-following models, not only addresses the limitations of data-driven approaches but also significantly enhances the accuracy of estimating the position of human-driven vehicles.

In general, current research on vehicle position and velocity estimation primarily falls into two categories: data-driven methods and model-based methods. Data-driven approaches face a major challenge stemming from the sparsity of sampled data, resulting in insufficient training data. On the other hand, while model-based methods for vehicle position estimation using car-following models address issues related to vehicles deviating significantly from these models, they also encounter accuracy challenges when dealing with sparsely sampled data. One significant reason for this is that existing research often utilizes predefined parameter models (e.g., Wiedemann and IDM) for vehicle estimation, disregarding the heterogeneity within traffic flow. For instance, car-following behaviors exhibit variability under different traffic densities, even within the same vehicle platoon, as local density fluctuations lead to variations in car-following behavior.

Considering the limitations of prior research, this paper presents a model-based method for estimating the speed and position of human-driven vehicles. Unlike conventional estimation methods that rely on fixed-parameter models, this study formulates the selection of parameters for the vehicle speed and position estimation models as two optimization problems. The objective is to use real-time traffic data from CAVs to adjust model parameters so that they better match the current traffic conditions, thus reducing estimation errors. The results demonstrate that the proposed method exhibits strong estimation performance, even in environments with low CAV penetration rates. The major contributions are as follows:
(1) In response to the issue of low estimation accuracy when using fixed-parameter models for vehicle speed and position estimation, this study, based on real-world traffic data, analyzes the model parameters that influence the precision of estimating the speed and position of human-driven vehicles.
(2) Considering the influence of unstable car-following behavior, and based on the analysis of model parameters that impact estimation performance, a method is proposed to optimize the speed estimation model and the corresponding parameters of the IDM using CAV detection data, thereby enhancing the precision of speed and position estimation for human-driven vehicles.
(3) Under various CAV penetration rates and different traffic densities, a comprehensive evaluation of the proposed method was conducted using real-world traffic scenario data. The results demonstrate that the proposed method consistently outperforms the baseline models, leading to significant improvements in both speed and position estimation accuracy as well as stability.

The rest of this paper is organized as follows. Section 2 introduces the research problem. Section 3 describes the speed and position estimation modeling method. Section 4 validates the proposed method using actual data. Section 5 concludes the paper and proposes future research directions.

2. Representative Scenario

First, we provide a representative scenario to explain our work, as shown in Figure 1. For modeling simplicity, we focus only on the car-following process without lane changing. The mixed traffic flow studied in this paper consists of two types of vehicles: human-driven vehicles (HVs) and connected and automated vehicles (CAVs). We assume that each CAV is equipped with a Mobile Object Detection and Tracking System (MODAT) to detect the surrounding traffic conditions [24]. CAVs are equipped with various sensors, including GPS, stereo vision cameras, and LiDAR. LiDAR is used to acquire point cloud data in the surrounding space, and the data from stereo vision cameras and GPS are fused to obtain real-time positions of surrounding vehicles. The acquired data include vehicle ID, timestamp, and location. Following [10, 20], the detection range for CAVs is set at 100 m. CAVs use their communication capabilities to upload this information to Roadside Units (RSUs). RSUs are equipped with data processing systems and computing units. The data processing system can generate high-resolution trajectory maps (speed, position, and acceleration) for CAVs and vehicles within the detection range. The computing unit estimates the speed and position of undetected human-driven vehicles based on the velocity and position estimation models and sends this information to all CAVs to assist them in evaluating the surrounding traffic conditions.

3. Methodology

Figure 2 illustrates a typical following scenario. In the following model, it is assumed that the driving behavior of the following vehicle is influenced by the changes in the motion of the leading vehicle, which is reflected in the speed, position, and acceleration of the following vehicle. Therefore, the following model can be used to define the behavior of the following vehicle under the influence of the leading vehicle. When the following vehicle significantly deviates from the expected behavior defined by the following model, it indicates that the following vehicle is influenced by other vehicles [26], and therefore, the position of these vehicles needs to be estimated.

3.1. Acceleration Estimation

In 2002, Helbing et al. [28] proposed the Intelligent Driver Model (IDM) based on empirical observations, which differs from most existing car-following models that separate free-flow and congested states. The IDM provides a unified framework to describe various states of vehicles, ranging from free flow to complete congestion, with a concise set of interpretable parameters. Furthermore, the IDM has been widely utilized to capture the effects of driving behavior between CAVs and HVs [7, 20, 24]. Therefore, for the sake of modeling convenience, this study assumes that the interactions between HVs and CAVs, as well as between HVs, conform to the IDM. The IDM takes into account both the position and velocity differences between the leading and following vehicles, and a typical IDM is shown in the following equation:

$$a_n(t) = a_{\max}\left[1 - \left(\frac{v_n(t)}{v_0}\right)^{\delta} - \left(\frac{s^{*}\left(v_n(t), \Delta v_n(t)\right)}{s_n(t)}\right)^{2}\right], \tag{1}$$

where $n$ represents vehicle $n$, which is the following vehicle in a group of adjacent cars; $a_{\max}$ is the maximum acceleration; $v_n(t)$ represents the velocity of the following vehicle; $v_0$ represents the desired velocity of the vehicle under free-flow conditions; $t$ represents the current time; $\delta$ is the acceleration exponent; and $s_n(t)$ represents the actual distance between two vehicles, defined as $s_n(t) = x_{n-1}(t) - x_n(t) - l$, in which $x_{n-1}(t)$ represents the position of the leading vehicle $n-1$, $x_n(t)$ represents the position of the following vehicle $n$, and $l$ represents the length of the vehicle. $s^{*}$ represents the desired distance between two vehicles, which is expressed as a function of the velocities of both vehicles in the following equation:

$$s^{*}\left(v_n(t), \Delta v_n(t)\right) = s_0 + v_n(t)\,T + \frac{v_n(t)\,\Delta v_n(t)}{2\sqrt{a_{\max}\,b}}, \tag{2}$$

where $\Delta v_n(t) = v_n(t) - v_{n-1}(t)$ is the velocity difference between the following car and the leading car; $s_0$ represents the minimum safe distance between two cars; $T$ represents the safe time headway between the following car and the leading car; and $b$ represents the comfortable deceleration.
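As a minimal sketch of the IDM acceleration and desired-gap expressions described above, the following Python functions evaluate the standard textbook form of the model. The function names and all default parameter values (vehicle length, desired speed, minimum gap, headway, maximum acceleration, comfortable deceleration) are illustrative assumptions, not the calibrated values used in this study.

```python
import math


def idm_desired_gap(v, dv, s0=2.0, T=1.5, a_max=1.0, b=2.0):
    """Desired gap s* for follower speed v and approach rate dv = v_follower - v_leader.

    Default parameter values are placeholders chosen for illustration only.
    """
    return s0 + v * T + v * dv / (2.0 * math.sqrt(a_max * b))


def idm_acceleration(x_lead, v_lead, x_follow, v_follow, length=5.0,
                     v0=30.0, delta=4.0, s0=2.0, T=1.5, a_max=1.0, b=2.0):
    """IDM acceleration of the follower given leader and follower states (SI units)."""
    gap = x_lead - x_follow - length  # actual net distance between the two vehicles
    if gap <= 0.0:
        raise ValueError("vehicles overlap: the net gap must be positive")
    s_star = idm_desired_gap(v_follow, v_follow - v_lead, s0, T, a_max, b)
    free_term = (v_follow / v0) ** delta       # free-road term
    interaction_term = (s_star / gap) ** 2     # interaction with the leader
    return a_max * (1.0 - free_term - interaction_term)
```

With a stationary follower and a very distant leader, the function returns a value close to the maximum acceleration, which is the expected free-road limit of the model.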

3.2. Checking for the Existence of Unsampled HVs

Figure 3 shows a typical following scenario in a mixed traffic flow. The first HV within the detection range of CAV2 is designated as the following vehicle , and the last vehicle within the detection range of CAV1 is designated as the leading vehicle . Assuming that there are two vehicles on the road, CAV1 and CAV2, with following within the detection range and responding based on ’s behavior, at time t, if the difference between the theoretical acceleration of calculated by IDM and the actual acceleration exceeds a preset threshold value , it is assumed that there exist unsampled HVs in the detection blind zone and their traffic parameters need to be estimated. For the selection of threshold value , we refer to the study by Goodall et al. [26], who calibrated the NGSIM data in the United States and found that when , it can describe various actual situations. The theoretical acceleration of is calculated by the following equation:

Therefore, the difference between theoretical and actual acceleration can be calculated by the following equation:

After determining the existence of unsampled HVs between CAV1 and CAV2, it is necessary to estimate the positions of these unsampled vehicles. At any time, the HVs whose positions need to be estimated must satisfy the minimum safe distance between and . We denote the i-th vehicle inserted in front of as , and therefore, the position of , denoted as , must be kept within the safe distance range between and , as shown in the following equation:where represents the minimum safe distance between and .
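The check described above can be sketched as follows. This is only an illustration under stated assumptions: the acceleration difference is taken as an absolute value, the threshold is passed in by the caller (no specific value is assumed here), and the feasible interval is derived from a net-gap definition of the form leader position minus follower position minus vehicle length. The function names are hypothetical.

```python
def has_unsampled_vehicles(a_theoretical, a_actual, threshold):
    """Return True when the follower deviates from the model by more than the threshold.

    Assumption: the deviation is measured as an absolute difference.
    """
    return abs(a_theoretical - a_actual) > threshold


def feasible_position_range(x_lead, x_follow, length, s_min):
    """Interval of positions where an inserted vehicle keeps the minimum safe gap.

    Assumption: net gap = x_front - x_rear - length, applied on both sides of
    the inserted vehicle. Returns None when no vehicle fits.
    """
    lower = x_follow + length + s_min
    upper = x_lead - length - s_min
    if lower > upper:
        return None
    return lower, upper
```

For example, with the leader at 100 m, the follower at 0 m, a 5 m vehicle length, and a 2 m minimum gap, an inserted vehicle could be placed anywhere between 7 m and 93 m.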

3.3. Estimation of the Lead Vehicle Speed

As shown in Figure 4, we assume that is following the first human-driven vehicle HV1 in the detection blind zone. The acceleration of the following vehicle is affected by its current speed and the speed of a leading vehicle HV1.

Therefore, the following conditions hold:where represents the estimated speed of the leading vehicle . The coefficient was calibrated by Goodall et al. [27] in 2013 to be 0.162 based on NGSIM data. Using a preset parameter model to estimate the speed of the leading vehicle in free flow and stable following situations may work well. However, when the following vehicle is sufficiently close to the leading vehicle, the randomness of the following vehicle’s acceleration increases, and a fixed cannot accurately capture the effect of the leading vehicle’s speed changes on the following vehicle’s acceleration. To verify this idea, we randomly extracted trajectory data of 1000 vehicle convoys for each density between 30 and 60 veh/km at intervals of 10 veh/km, considering the majority of density scenarios in actual road conditions using the NGSIM data collected on the US-101 highway. We analyzed the effect of different densities on the value of , and the calibration formula used is shown in the following equation:

In Figure 5, the change trend of the value under densities of 30–60 veh/km is shown, and with the increase of the number of selected vehicle platoons, the values under different densities finally converge to different values. Table 1 provides the statistical values of under different densities. It can be found that when the traffic density is between 30 and 50 veh/km, the absolute value of the average of coefficient increases with the growing density. This can be easily explained because as the traffic density increases, the headway distance between vehicles becomes smaller, and the impact of the preceding vehicle on the following vehicle becomes more severe. When the traffic flow enters the range of 50–60 veh/km, becomes smaller. At this point, the traffic flow gradually transitions from free flow to congestion, and most vehicles in the platoon will follow the preceding vehicle’s speed closely. The influence of the preceding vehicle’s velocity change on the following vehicle’s velocity and acceleration gradually decreases. Therefore, it can be verified that the degree to which the acceleration of the following vehicle is affected by its own current speed and the preceding vehicle’s velocity depends on the traffic density. Using the preset value method to estimate the velocity of the leading vehicle cannot well reflect the real traffic situation.

To address this issue, this study incorporates an optimization factor into the model as shown in equation (8), where is calculated through optimization based on the detection data within the CAV detection range. In addition, it is considered that the estimated speed of the lead vehicle should not exceed the free-flow speed, as shown in equation (9). Based on these considerations, a method is proposed to use real-time CAV sample data to optimize the value of (the optimization method is introduced later and is also applied to improve the IDM) in order to improve the accuracy of HV speed estimation.

Once the speed of HV1 immediately leading the vehicle has been determined, it is reasonable to assume that the speeds of other HVs ahead of HV1 and between HV1 and the vehicle can be estimated as the average of the speeds of HV1 and , considering that in stable traffic flow, the speed difference between the leading and following vehicles is small. The speed is calculated by the following equation:

3.4. Estimation of HV Position

The essence of using a car-following model to estimate the position of the lead vehicle is to calculate the headway between the following vehicle and the leading vehicle based on the following vehicle’s speed, position, and acceleration. Solving and transforming equation (1) yields equation (11), and substituting equation (2) into equation (11) results in equation (12).

When the traffic flow reaches a stable state, it can be assumed that the following vehicle will follow the leading vehicle at a speed close enough to the front vehicle and the leading vehicle’s acceleration is 0. Therefore, approaches 0 infinitely, and equation (2) can be simplified to equation (13). Equation (10) can be transformed to equation (14), and the estimated position of the leading vehicle can be calculated by equation (15).
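The inversion described above can be sketched in Python. Solving the IDM acceleration expression for the gap and using the steady-state desired gap (minimum distance plus speed times headway) gives the net gap as the desired gap divided by the square root of one minus the free-road term minus the normalized acceleration. The function name and default parameter values are illustrative assumptions, and the sketch simply rejects states for which the expression under the square root is not positive.

```python
import math


def estimate_leader_position(x_follow, v_follow, a_follow, length=5.0,
                             v0=30.0, delta=4.0, s0=2.0, T=1.5, a_max=1.0):
    """Estimate the leader position from the follower state by inverting the IDM.

    Assumes steady-state following (speed difference close to zero), so the
    desired gap reduces to s0 + v*T. Defaults are placeholders, not calibrated.
    """
    denom = 1.0 - (v_follow / v0) ** delta - a_follow / a_max
    if denom <= 0.0:
        raise ValueError("follower state is not consistent with a finite IDM gap")
    desired_gap = s0 + v_follow * T
    gap = desired_gap / math.sqrt(denom)   # net distance to the leader
    return x_follow + gap + length         # leader position
```

The guard matters in practice: a follower accelerating at or above the maximum acceleration cannot be explained by any finite gap, so no leader position can be inferred from that sample.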

Based on equations (1)–(15), we can estimate the position and velocity of unsampled HVs in the detection blind zone at any time. When an HV is inserted into the undetected range, it can be treated as a known vehicle within the CAV detection range, and its information can be used to continue estimating the position of the leading vehicle. The entire estimation process is repeated until it exceeds the predetermined estimation interval. To verify the effectiveness of using the IDM with preset parameters to estimate the position of unsampled HVs, we evaluate the mean absolute error between the estimated position of the HV (denoted by ) and its actual position (denoted by ) by the following equation:where represents the number of HVs to be inserted between and .
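The mean absolute error used for this evaluation can be computed as in the short sketch below; the function name and the list-based interface are assumptions made for illustration.

```python
def mean_absolute_error(estimated, actual):
    """Mean absolute error between estimated and actual HV positions (same order)."""
    if len(estimated) != len(actual) or len(estimated) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(e - a) for e, a in zip(estimated, actual))
    return total / len(estimated)
```

For three inserted HVs with estimated positions 10, 20, and 33 m and actual positions 12, 19, and 30 m, the result is 2.0 m.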

However, when the traffic density is moderate, using equations (1)–(16) may not be a problem, as the mutual influence between vehicles remains stable and the following car can maintain a stable following rule with the leading car. When the traffic density is low, the mutual influence between vehicles is weak, and the IDM with preset parameters may not accurately estimate the headway distance between two cars. On the other hand, as the traffic density increases, the following car may approach the leading car with a smaller safe headway distance. At this time, the randomness of the influence of the leading car on the following car increases. For example, if the leading car suddenly decelerates, its influence on the following car mainly manifests in the significant oscillation of its acceleration.

In the IDM, the acceleration exponent characterizes how the acceleration decreases. A larger exponent means a higher rate of change of the acceleration of the following vehicle, and in previous studies [28], it was usually set to 4. To verify the correlation between the acceleration exponent and traffic density, we extracted the trajectory data of four different traffic density scenarios with intervals of 10 veh/km between 30 and 60 veh/km from the NGSIM dataset and specified the first and last vehicles in the convoy as CAV1 and CAV2, respectively. Based on previous studies, we first set the safe time headway T in the IDM as a constant (e.g., 1.98 s) and selected different values of the acceleration exponent to estimate the position of HV1 using equations (1)–(15). The changes in MAE under different exponent values are shown in Figure 6. It can be seen that the MAE of estimating vehicle positions increases with an increasing exponent at a density of 30 veh/km, and at a density of 40 veh/km, the MAE reaches a minimum at an exponent value of 1.6. This indicates that when the traffic density is moderate, the mutual influence between vehicles is weak, and the rate of change of the acceleration of the following vehicle is less affected by the preceding vehicle and remains relatively stable. At traffic densities between 50 veh/km and 60 veh/km, the MAE of estimating vehicle positions decreases with an increasing exponent, indicating that as traffic density increases, the degree of mutual influence between vehicles increases, and the rate of change of the acceleration of the following vehicle increases accordingly, especially when traffic density is between congested flow and free flow.

In the IDM, parameter represents the safe time headway between the following vehicle and the leading vehicle. The IDM assumes that the vehicle maintains a fixed safe headway when following the leading vehicle, which is obviously unrealistic. Similar to the analysis method of the value in the IDM, based on the NGSIM data, we extracted 1000 different platoon trajectory data in the density range of 30–60 veh/km with an interval of 10 veh/km, aiming to analyze the evolution of the average headway of platoon under different densities. The headway calculation formula used is shown in equation (17), where represents the time headway between the following vehicle and the leading vehicle . Figure 7 shows the change of the average headway of platoon under different densities, and Table 2 provides statistical values for the average headway of platoons under different density conditions. It is evident that under density conditions ranging from 30 veh/km to 60 veh/km, the average platoon time headway decreases as traffic density increases. This conforms to the fundamental characteristics of traffic flow. Figure 7 illustrates the variations in time headway under different density conditions. It can be observed that, when density is held constant, the average time headway within different platoons exhibits fluctuations due to the heterogeneity in following behaviors. For instance, within the same vehicle platoon, variations in driving behaviors among different drivers result in these fluctuations. Cautious drivers tend to maintain a larger gap behind the leading vehicle during the following process, while more aggressive drivers tend to closely tail the leading vehicle. Thus, even within the same platoon, differences in driving behavior can lead to local changes in traffic density.
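A minimal sketch of the platoon headway statistic is given below. It assumes the time headway of a follower is the distance between the reference points of leader and follower divided by the follower speed; whether the headway formula in this study subtracts the vehicle length is not recoverable from the text, so this definition is an assumption, as are the function names.

```python
def time_headway(x_lead, x_follow, v_follow):
    """Time headway of the follower (assumed: position difference / follower speed)."""
    if v_follow <= 0.0:
        raise ValueError("time headway is undefined for a stopped follower")
    return (x_lead - x_follow) / v_follow


def mean_platoon_headway(positions, speeds):
    """Average time headway over a platoon listed from the first vehicle to the last."""
    if len(positions) != len(speeds) or len(positions) < 2:
        raise ValueError("need matching lists with at least two vehicles")
    headways = [
        time_headway(positions[i], positions[i + 1], speeds[i + 1])
        for i in range(len(positions) - 1)
    ]
    return sum(headways) / len(headways)
```

For a three-vehicle platoon at 100, 80, and 50 m with follower speeds of 10 and 15 m/s, both headways equal 2 s, so the platoon average is 2.0 s.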

Therefore, we can reasonably assume that the acceleration index and the safe time headway are related to traffic density. Previous studies have assumed them as fixed values, which differs from the randomness of vehicle behavior found in the above experiments. Therefore, before estimating the position of the preceding vehicle using the IDM, we introduce correction factors and in the formula for to, respectively, correct the acceleration exponent and headway based on the sampling data obtained within the CAV detection range. The aim is to obtain IDM parameters that are more in line with the actual situation of the current vehicle fleet. The correction process, similar to the previous speed correction factor , will be discussed in detail in the following. Therefore, equation (12) is transformed into equations (14) and (18) is transformed into equation (19):

The flowchart of the proposed vehicle position estimation method is shown in Figure 8. This method includes four parts: determining the difference between expected and actual behavior, optimizing model parameters, estimating traffic parameters, and estimating vehicle position. The specific details will be discussed in the following sections.

3.5. The Calculation Procedure for the Correction Factors (, , )

In this section, we will provide a detailed explanation of how to calculate the three correction factors using the platoon data extracted from the NGSIM dataset. Assuming that the platoon scenario extracted is as shown in Figure 9, when any vehicle in the platoon is assumed to be a CAV, as depicted in Figure 10, there are a total of vehicles within its detection range, including the CAV itself, all sequentially numbered as ; with the aid of various CAV sensors, information about the vehicles (including speed, position, and acceleration) can be acquired. Let be defined as the group of leading and following vehicles within these vehicles (e.g., ). As a result, the platoon comprises a total of vehicle combinations (). Taking as an example, based on the information of (velocity, position, and acceleration), it is possible to estimate the information of and then compare it with the information of . The process can be repeated for vehicle combinations to vehicle combinations .

3.5.1. Speed Correction Factor

When optimizing the value of , it is necessary to impose constraints on the range of . In order to ensure that the estimated speed of does not exceed the free-flow speed and is not negative, is constrained as follows:

Substituting (20) into (21), is constrained as follows:

Taking vehicle combination as an example, assuming that the real speed, position, and acceleration of and obtained are and , by substituting the speed and acceleration of into equation (8), the estimated speed of vehicle can be calculated. While varying within the feasible range leads to different results for , when the estimated speed of vehicle is closest to its real speed , the corresponding value of is the result obtained using vehicle combination . Expanding this calculation process to all vehicle combinations in the entire platoon, in all vehicle combinations (, ,), the leading vehicles are , respectively, each with different values of within the feasible range, and their respective estimated speeds also change. When the overall error in the estimated speeds of all vehicles is minimized, the corresponding is the final result after the calculations are completed.

Thus, an objective function can be constructed to solve for the optimal speed correction factor , taking into consideration all vehicle combinations () within the detection range that can be used for calculations. The vehicle in the platoon used for estimation is denoted as , and the optimization objective is to minimize the root mean square error between 's real speed and estimated speed, as shown in the following equation:
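The search for the speed correction factor can be sketched as a bounded one-dimensional minimization of the root mean square error over all detected leader-follower pairs. The sketch below uses a plain grid search rather than a dedicated nonlinear solver, and it takes the speed estimation model as a caller-supplied function because the exact model form is not reproduced here. The stand-in model in the usage example (follower speed plus factor times follower acceleration) is purely hypothetical and is not the model of this study.

```python
import math


def optimize_speed_factor(pairs, speed_model, factor_min, factor_max, steps=200):
    """Grid-search the correction factor that minimizes speed RMSE.

    pairs: list of (v_follow, a_follow, v_lead_true) tuples from detected vehicles.
    speed_model: callable (v_follow, a_follow, factor) -> estimated leader speed.
    Returns (best_factor, best_rmse).
    """
    if not pairs:
        raise ValueError("at least one vehicle pair is required")
    best_factor, best_rmse = None, float("inf")
    for k in range(steps + 1):
        factor = factor_min + (factor_max - factor_min) * k / steps
        squared = [(speed_model(vf, af, factor) - vl) ** 2 for vf, af, vl in pairs]
        rmse = math.sqrt(sum(squared) / len(squared))
        if rmse < best_rmse:
            best_factor, best_rmse = factor, rmse
    return best_factor, best_rmse


# Usage with a hypothetical stand-in model (an assumption for illustration only).
stand_in = lambda vf, af, factor: vf + factor * af
sample_pairs = [(10.0, 0.5, 11.0), (12.0, -1.0, 10.0), (8.0, 0.25, 8.5)]
best_factor, best_rmse = optimize_speed_factor(sample_pairs, stand_in, 0.0, 4.0, steps=400)
```

A grid search is used here only because it is easy to verify; any bounded scalar optimizer could replace the loop without changing the objective.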

3.5.2. Time Headway Correction Factor and Acceleration Exponent Correction Factor

When adjusting the value of the correction factor , it is essential to take into account the time headway situation in the actual traffic scenario. Therefore, the values of the integral term should satisfy the following equation:where and represent the minimum and maximum time headways, respectively. Based on equation (5), after calibrating the extracted NGSIM data, we obtain a minimum time headway of 0.8 s and a maximum time headway of 5 s. Equation (22) can be rearranged to obtain the range of values for , as shown in the following equation:

Referring to Helbing et al. [28], . The integral term should satisfy ; during the optimization process, we set to a fixed value of 5, and is constrained as follows:

Taking vehicle combination as an example, by substituting the speed , position , and acceleration of into equation (19), the estimated position of vehicle can be calculated. While varying and within the feasible range leads to different results for , when the estimated position of vehicle is closest to its real position , the corresponding value of and is the result obtained using vehicle combination . Expanding this calculation process to all vehicle combinations in the entire platoon, in all vehicle combinations (, ,), the leading vehicles are , respectively, each with different values of and within the feasible range, and their respective estimated positions also change. When the overall error in the estimated positions of all vehicles is minimized, the corresponding value of and is the final result after the calculations are completed.

Thus, an objective function can be constructed to solve for the optimal time headway correction factor and acceleration exponent correction factor , taking into consideration all vehicle combinations () within the detection range that can be used for calculations. The vehicle in the platoon used for estimation is denoted as , and the optimization objective is to minimize the root mean square error between 's real position and estimated position, as shown in the following equation:

In summary, the determination of the three correction factors reduces to two nonlinear optimization problems. Based on equations (1)–(26), we can estimate the position and speed of the HVs in the detection blind zone between CAV1 and CAV2. Once a vehicle has been estimated, it is treated as a known vehicle and used to estimate the position of the HV ahead of it. The position estimation stops when the position estimation interval is exceeded, at which point the complete convoy position information is obtained. To better describe the entire position estimation process, we provide Algorithm 1 as an explanation.

% Step 1: Obtain the speed, position, and acceleration information of the vehicles within the detection range of CAV1 and CAV2.
Vehicle information:
Speed information:
Position information:
Acceleration information:
% Step 2: Calculate the speed correction factor, acceleration exponent correction factor, and time headway correction factor at the current time.
While
  Optimization Correction Factors :
  Optimization Correction Factors and :
Output:
% Step 3: Estimate the position of the HV.
Determine if there are any undetected HVs that need to be estimated:
Calculate the theoretical acceleration of:
If:
  % There are HVs that have not been detected and need to be estimated.
  While:
    % Calculate the velocity of the HVs within the undetected range.
    % Calculate the vehicle spacing.
    % Calculate the position of the undetected HV.
Else:
  Continue
Output:
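The spacing and position steps of Algorithm 1 can be illustrated with a minimal sketch. It assumes the standard IDM without correction factors, inverted to give the gap that is consistent with a follower's observed speed and acceleration and an estimated leader speed. The parameter values, the vehicle length, and the front-bumper position convention are hypothetical and are not taken from Table 3 or equation (19).

```python
import math

# Hypothetical IDM parameters for illustration only (not the values of Table 3).
PARAMS = dict(v0=30.0, T=1.5, s0=2.0, a_max=1.0, b=2.0, delta=4.0)

def desired_gap(v, v_lead, p):
    """Standard IDM desired gap s* for follower speed v and leader speed v_lead."""
    return p["s0"] + v * p["T"] + v * (v - v_lead) / (2.0 * math.sqrt(p["a_max"] * p["b"]))

def idm_acceleration(v, v_lead, gap, p=PARAMS):
    """Standard IDM acceleration of the follower."""
    return p["a_max"] * (1.0 - (v / p["v0"]) ** p["delta"] - (desired_gap(v, v_lead, p) / gap) ** 2)

def idm_gap(v, v_lead, acc, p=PARAMS):
    """Invert the IDM for the gap; return None when no positive gap exists."""
    s_star = desired_gap(v, v_lead, p)
    radicand = 1.0 - (v / p["v0"]) ** p["delta"] - acc / p["a_max"]
    if s_star <= 0.0 or radicand <= 0.0:
        return None
    return s_star / math.sqrt(radicand)

def leader_position(x_follower, v, v_lead, acc, vehicle_length=5.0, p=PARAMS):
    """Front-bumper position of the leader (assumed convention)."""
    gap = idm_gap(v, v_lead, acc, p)
    return None if gap is None else x_follower + gap + vehicle_length

# Round trip: generate an acceleration from a known gap, then recover the leader position.
acc = idm_acceleration(v=15.0, v_lead=14.0, gap=25.0)
x_lead = leader_position(x_follower=100.0, v=15.0, v_lead=14.0, acc=acc)
```

Once `x_lead` is available, the estimated vehicle can play the role of the known follower for the next vehicle ahead, which is the propagation described before Algorithm 1. The `None` return marks states for which the inverted model has no valid gap, for example an acceleration larger than the free-road acceleration at that speed.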

4. NGSIM Data-Based Experiments

4.1. Experimental Setup

Firstly, we set up various experimental scenarios involving different densities and penetration rates, as well as nonideal conditions more in line with real traffic scenarios, to evaluate the performance of the proposed method.

Many scholars use the Next Generation Simulation (NGSIM) dataset as their data source when studying real vehicle trajectories. The dataset covers the southbound lanes of US-101, the eastbound lanes of I-80 in Emeryville, California, Peachtree Street in Atlanta, Georgia, and other sites. To validate the practicality of the proposed method, we selected vehicle trajectory data from the US-101 segment to assess its performance and to conduct comparative experiments. The characteristics of the experimental segment are illustrated in Figure 11.

The experiments were conducted under traffic densities ranging from 30 veh/km to 65 veh/km in steps of 5 veh/km (eight density levels), as this range covers most real-world traffic scenarios. In total, 400 traffic scenarios were extracted from the experimental data, 50 for each density level. Each scenario represents a continuous queue of vehicles on a lane, with the number of vehicles in the queue corresponding to the density condition. The data include vehicles’ IDs, timestamps, positions, speeds, accelerations, and lane IDs.

4.2. Optimization Problem Calculation

To compute the correction factors described in Section 3.5, the two nonlinear optimization problems in equations (20) and (21) must be solved. In this paper, we used the Particle Swarm Optimization (PSO) algorithm with the following parameters: population size of 100, inertia weight of 0.4, individual weight of 0.7, social weight of 0.9, a maximum of 500 iterations, a minimum particle movement step of 1e-8, and a minimum change in the objective function value of 1e-8. Because the initial positions of the population differ between runs, each run can produce a different result; we therefore repeated the optimization process 10 times and selected the correction factors corresponding to the minimum error as the optimization result. We do not assign specific IDM parameters to CAVs but instead assume human-like behavior for CAVs. Therefore, both CAVs and HVs use the IDM parameters of HVs. The IDM parameters used in the calculations are based on the research by Chen et al. [20], as shown in Table 3. The other experimental parameters are shown in Table 4, and the optimization results of the correction factors for each density are shown in Table 5.
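The solver configuration above can be illustrated with a minimal PSO sketch that uses the stated parameter values and the ten-restart selection. The objective `speed_rmse`, the synthetic speeds, the search bounds, and the exact way the two stopping thresholds are applied are assumptions made for illustration; they do not reproduce equations (20) and (21).

```python
import numpy as np

def pso_minimize(objective, lower, upper, seed, n_particles=100, inertia=0.4,
                 c_individual=0.7, c_social=0.9, max_iter=500,
                 min_step=1e-8, min_change=1e-8):
    """Minimize objective(x) over the box [lower, upper] with a basic PSO."""
    rng = np.random.default_rng(seed)
    lower = np.atleast_1d(np.asarray(lower, dtype=float))
    upper = np.atleast_1d(np.asarray(upper, dtype=float))
    pos = rng.uniform(lower, upper, size=(n_particles, lower.size))
    vel = np.zeros_like(pos)
    pbest = pos.copy()
    pbest_val = np.array([objective(p) for p in pos])
    g = int(np.argmin(pbest_val))
    gbest, gbest_val = pbest[g].copy(), float(pbest_val[g])
    for _ in range(max_iter):
        r1 = rng.random(pos.shape)
        r2 = rng.random(pos.shape)
        vel = inertia * vel + c_individual * r1 * (pbest - pos) + c_social * r2 * (gbest - pos)
        new_pos = np.clip(pos + vel, lower, upper)  # keep particles inside the bounds
        largest_step = float(np.max(np.linalg.norm(new_pos - pos, axis=1)))
        pos = new_pos
        vals = np.array([objective(p) for p in pos])
        better = vals < pbest_val
        pbest[better] = pos[better]
        pbest_val[better] = vals[better]
        g = int(np.argmin(pbest_val))
        change = gbest_val - float(pbest_val[g])
        if change > 0:
            gbest, gbest_val = pbest[g].copy(), float(pbest_val[g])
        # Assumed stopping rule: all particles nearly still, or a tiny improvement.
        if largest_step < min_step or 0 < change < min_change:
            break
    return gbest, gbest_val

# Synthetic data (assumption): follower speeds are 0.9 times the leader speeds.
lead_speed = np.array([10.0, 12.0, 14.0, 11.0])
follow_speed = 0.9 * lead_speed

def speed_rmse(x):
    """Toy objective: RMSE of a scaled-leader speed estimate."""
    return float(np.sqrt(np.mean((x[0] * lead_speed - follow_speed) ** 2)))

# Ten independent runs with different seeds; keep the run with the smallest error.
runs = [pso_minimize(speed_rmse, [0.5], [1.5], seed=s) for s in range(10)]
best_factor, best_error = min(runs, key=lambda r: r[1])
```

Seeding each run separately makes the restarts reproducible while still varying the initial population, which is the source of run-to-run variation noted in the text.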

4.3. Performance under Different Densities

In this section, we consider a fixed number of CAVs in the platoon, which is set to 2 since the proposed estimation method requires at least two CAVs. To evaluate the performance of the proposed method under nonideal conditions, we assume that the lead and rear vehicles in the platoon are CAVs, which enables us to estimate more vehicles. Figure 12 shows a typical scenario of estimating the positions of HVs in the unsampled blind zone of the platoon with a fixed number of CAVs. The positions of the vehicles sampled by CAVs are denoted by green dots, and the unsampled vehicles are denoted by red dots. Our goal is to estimate the velocities and positions of the unsampled vehicles based on the observed information. To evaluate the velocity estimation performance of unsampled vehicles, we calculate the mean absolute error between the true velocity and the estimated velocity at the current time for each unsampled vehicle, as shown in equation (26). To evaluate the position estimation performance of unsampled vehicles, we calculate the mean absolute error between the true position and the estimated position for each unsampled vehicle, as shown in the following equation:

Figures 13(a)–13(h) present the average absolute errors between the estimated and true vehicle speeds for the two methods under 50 different traffic scenarios within the density range of 30–65 veh/km. Table 6 provides the statistical values of the errors. As shown in the figure, the proposed method achieves higher speed estimation accuracy than the Goodall method. To further verify the stability of the proposed speed estimation method, the standard deviations of the mean absolute errors are also provided in Table 6; a lower value indicates stronger stability. According to the statistical results in Table 6, the proposed method is more stable than the Goodall method at all densities. This is because the proposed method builds on the Goodall algorithm but also accounts for the heterogeneity of following behavior caused by different traffic densities, which affects vehicle speed, and can therefore better reflect vehicle speeds in different traffic scenarios.

Figure 14 shows the estimated and true vehicle positions under 50 different traffic scenarios within the density range of 30–65 veh/km for both the proposed method and the method proposed by Chen, which uses calibrated expected spacing (panels such as (a-1)). It also shows the average absolute errors between the estimated and true positions for both methods (panels such as (a-2)). The red dashed line represents the true vehicle position, the green dashed line represents the vehicle position estimated by the proposed algorithm, and the blue dashed line represents the vehicle position estimated by Chen’s method. From the subplots, it can be observed that the number of vehicles to be estimated increases with traffic density. Moreover, under all density conditions, the proposed method achieves higher estimation accuracy than Chen’s method.

Table 7 presents the average mean absolute error (MAE) of position estimation for the two methods under different densities. The standard deviation of the average MAE is also provided to assess the stability of the proposed position estimation method. The results demonstrate that the proposed method outperforms Chen’s method in terms of accuracy and stability across the tested density conditions. The error in vehicle position estimation decreases as the traffic density increases, which is expected. In low-density traffic, the larger vehicle spacing and weaker interaction between vehicles make it difficult for the proposed method to capture detailed driving behavior accurately. As the traffic density increases, the reduced vehicle spacing and stronger following behavior make it easier for the proposed method to interpret the driving behavior of each vehicle, leading to improved position estimation accuracy. In contrast, the MAE of Chen’s method increases from 55 veh/km to 60 veh/km. This can be attributed to the fact that Chen’s method considers the impact of traffic density primarily through calibration of the desired headway while overlooking other important factors, such as variations in vehicle speed, headway, and acceleration rate of change, that affect the accuracy of position estimation. In summary, in real-world traffic environments, the method of calibrating the desired headway is more suitable for low to medium traffic densities, whereas the proposed method, which incorporates three correction factors for speed, time headway, and acceleration rate of change, provides more accurate position estimation for unsampled vehicles across a variety of traffic scenarios.
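The error statistics used in this section can be illustrated with a minimal sketch: a per-scenario MAE over the unsampled vehicles, followed by the average and the standard deviation of those MAEs across scenarios. The numbers are invented for illustration, and the use of the population standard deviation is an assumption, since the text does not state which form is used.

```python
import statistics

def mean_absolute_error(true_values, estimated_values):
    """MAE between true and estimated values for the unsampled vehicles of one scenario."""
    if not true_values or len(true_values) != len(estimated_values):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - e) for t, e in zip(true_values, estimated_values)) / len(true_values)

# Illustrative (true, estimated) positions in metres for three scenarios.
scenarios = [
    ([100.0, 120.0, 140.0], [101.0, 118.0, 143.0]),
    ([50.0, 60.0], [52.0, 64.0]),
    ([10.0, 20.0, 30.0, 40.0], [11.0, 21.0, 31.0, 41.0]),
]

scenario_maes = [mean_absolute_error(t, e) for t, e in scenarios]
average_mae = statistics.mean(scenario_maes)
# Population standard deviation (assumption); statistics.stdev gives the sample form.
std_of_mae = statistics.pstdev(scenario_maes)
```

The same two functions apply unchanged to speeds, so one helper can serve both the velocity and the position evaluation.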

4.4. Performance under Different CAV Penetration Rates

In this section, the traffic density is held fixed: we conducted experiments using data at a traffic density of 60 veh/km, which falls between free flow and congested flow. At this density, there are 40 vehicles on the selected US-101 road segment; since the proposed method requires at least 2 CAVs, the minimum CAV penetration rate is 5% (2 of 40 vehicles). Therefore, we set the CAV penetration rates to 5%, 7.5%, 10%, and 12.5%, corresponding to 2, 3, 4, and 5 CAVs, and randomly selected the corresponding number of vehicles in the platoon to be CAVs. The experiments were conducted on 50 randomly selected traffic scenarios using different random seeds.
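The scenario setup above can be illustrated with a minimal sketch that converts each penetration rate into a number of CAVs for a 40-vehicle platoon and draws the CAV indices reproducibly from a seed. The function names and the zero-based vehicle indices are assumptions made for illustration.

```python
import random

PLATOON_SIZE = 40  # vehicles on the segment at 60 veh/km, as stated in the text
PENETRATION_RATES = [0.05, 0.075, 0.10, 0.125]

def cav_count(rate, platoon_size=PLATOON_SIZE):
    """Number of CAVs implied by a penetration rate."""
    return round(rate * platoon_size)

def pick_cavs(rate, seed, platoon_size=PLATOON_SIZE):
    """Randomly choose which vehicle indices are CAVs, reproducibly for a given seed."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(platoon_size), cav_count(rate, platoon_size)))

counts = [cav_count(r) for r in PENETRATION_RATES]
# One draw per scenario; 50 seeds give 50 scenarios at a 10% penetration rate.
draws = [pick_cavs(0.10, seed) for seed in range(50)]
```

Using a dedicated `random.Random(seed)` instance per scenario keeps each draw independent of any other random number use in the experiment.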

Figure 15 shows the estimated vehicle positions (represented by red dashed lines) and the actual vehicle positions (represented by green solid lines) for the proposed method under different penetration rates. It is evident that as the CAV penetration rate increases, the number of human-driven vehicles (HVs) to be estimated decreases. This indicates that the proposed method effectively captures the characteristics of real traffic scenarios.

Figure 16 shows the variation of mean absolute error (MAE) in position estimation between the proposed method (represented by green solid lines) and Chen’s method (represented by blue solid lines) across 50 different convoy scenarios under varying penetration rates. Similarly, Figure 17 shows the MAE variation in speed estimation between the proposed method (represented by green solid lines) and Goodall’s method (represented by red solid lines) under different penetration rates. To provide a comprehensive performance evaluation, the average MAE values for position and velocity estimation were computed and are presented in Table 8. Notably, the results demonstrate a consistent decrease in MAE values for both position and velocity estimation as the penetration rate increases. This can be attributed to the increased availability of HV trajectory information within the CAV sampling range, which contributes to enhanced optimization of the correction factors and subsequently improves the accuracy of velocity and position estimation. Furthermore, when compared to Goodall’s method, the proposed velocity estimation method consistently outperforms it across all penetration rate scenarios. Moreover, the proposed position estimation method significantly reduces the error in HV position estimation compared to Chen’s method.

5. Conclusion and Future Work

This paper presents a method for velocity and position estimation of vehicles based on sampled data from CAVs and its validation using real-world datasets. Firstly, vehicle trajectory data under different densities are extracted from the NGSIM dataset. Based on the model proposed by Goodall for estimating the preceding vehicle speed, the influence of the preceding vehicle’s speed variation on the following vehicle’s acceleration is analyzed under different densities. The results show that this influence increases with increasing traffic density under noncongested conditions but decreases when the density is between congested and free-flowing states. Therefore, a velocity correction factor is introduced based on the Goodall model, and the correction factors are optimized using CAV sampled data, transforming the determination of the correction factors into a nonlinear optimization problem. Next, the variations of average time headway and acceleration rate of change are analyzed under different densities. The experimental results demonstrate that when density is held constant, the average time headway within different platoons exhibits fluctuations due to the heterogeneity in following behaviors, while the acceleration rate of change exhibits higher randomness under low densities. Therefore, time headway correction factors and acceleration exponent correction factors are introduced in IDM, and their values are determined using CAV detection data. The determination of these correction factors is transformed into a nonlinear optimization problem involving two parameters. Then, based on the proposed speed estimation model and position estimation model, the speed and position of unsampled HVs are estimated in a CAV platoon. Finally, the performance of the proposed method is experimentally validated using the NGSIM dataset. 
The results show that even under the extreme condition where there are only two CAVs in the platoon, serving as the lead and trailing vehicles, the proposed method reflects the true speeds of vehicles under different densities more accurately than the linear model-based method. Furthermore, compared to the method that estimates the positions of unsampled vehicles using calibrated desired headways, the proposed method significantly reduces the estimation errors of undetected vehicle positions. Additionally, the CAV penetration rate has a minimal impact on the estimation results, and in general, the error in position estimation decreases with an increase in the penetration rate.

Although the influence of unstable following behavior has been considered in this study, its scope is limited to the vehicles within the detection range of the CAVs, because the optimization of the IDM parameters is based on vehicle data within that range. In future research, the proposed method can be extended by considering time-variant factors in the CAV detection blind zone to reproduce more realistic HV information. In addition, because the proposed estimation method operates at the level of individual time instants, the influence of lane-changing behavior on multilane roads is greatly reduced.

Data Availability

The dataset is available at https://data.transportation.gov/Automobiles/Next-Generation-Simulation-NGSIM-Vehicle-Trajector/8ect-6jqj.

Conflicts of Interest

The authors declare that they have no conflicts of interest.