Abstract

Accessibility reflects the convenience of people’s travel and the livability and sustainability of cities, and improving it can reduce negative impacts on the environment and safety. It is therefore a key concept in urban sustainable development policies across the world. Based on multisource data, this paper presents an empirical analysis of the urban accessibility of Qingdao city from three angles: spatial linkage strength, job-housing ratio, and the distributions of large-scale hospitals and schools. The multisource data include navigation data, location based service (LBS) data, Point of Interest (POI) data, and census data. The inter-city linkage strength and internal-external linkage strength are used to evaluate the spatial linkages of the urban agglomeration. Results show that spatial connectivity and public facilities have a very strong positive correlation with inter-city accessibility. Meanwhile, providing transport infrastructure and services could greatly strengthen the accessibility between two areas. The job-housing ratio is used to analyze the distribution of residents and jobs. Results show that the job-housing ratio declines gradually from the central urban areas to the surrounding areas. The distributions of large-scale hospitals and schools are used to estimate the service capacity of public facilities. Results show that public service facilities are mainly concentrated in the developed areas. In contrast, the other areas developed slowly because they lacked supporting service facilities.

1. Introduction

The “global village” is a vivid description of a world that is becoming increasingly connected. Although this is certainly true of communication, especially instant channels such as e-mail, telephones, the internet, and social network services (Facebook, Twitter, Vine, Tumblr, Instagram, and so on), most physical links between locations, and thus the time it takes to move between them, are still bound by the cities’ available infrastructure when an actual trip is made [1]. Advancing accessibility is at the core of the Sustainable Development Goals (SDGs), proposed by the Department of Economic and Social Affairs of the United Nations in 2015, which clearly demand that governments improve accessibility to key services such as education programs, health services, and banking and financial institutions [2]. This highlights, implicitly or explicitly, that accessibility is a key concept in sustainable development and transport policies and in reducing negative impacts on the environment and safety across the world. It is worth noting that cities are the main carriers of such activities. Therefore, most studies, like this paper, have focused on accessibility within cities.

Because of the significance of accessibility for city dwellers and policymakers, researchers have studied it in many ways, ranging from defining the concept of accessibility [3–8] and measuring accessibility [5, 7, 9–16], through the relationship between the built environment and active travel or physical activity [17–22], to the relationship between accessibility and equity [23–28]. For an overview of the literature on accessibility see, for example, Geurs and van Wee [12], Iacono et al. [13], Maghelal and Capp [14], Páez et al. [29], Talen and Koschinsky [15], Vale et al. [30], or Kelobonye et al. [31].

Trips arise from the spatial separation of the places where people pursue their purposes (working, going to school, shopping, going to hospital, etc.). A city’s spatial accessibility is therefore an important measure of spatial separation, reflecting the ease of reaching opportunities [30]. It is related to spatial connectivity and public facilities, which result from the layout and scheduling of the urban spatial structure. Higher separation implies lower accessibility. Accessibility also offers a framework for assessing the validity of the urban spatial structure, the distribution of urban public facilities, land use, and transport systems. It can be promoted by altering the characteristics of the built environment, e.g., density (residential and employment), diversity (land-use mix and destinations), distance (proximity), route characteristics (street connectivity and quality of the infrastructure), safety (both personal and from traffic), aesthetic qualities (trees, parks and open spaces, bus shelters, etc.), and topography [30, 32–36]. In general, two methods are used to improve accessibility by altering the built environment [31]. One is grouping, e.g., co-locating housing, hospitals, schools, and jobs. The other is providing transport infrastructure and services, e.g., building roads, subways, submarine tunnels, and highways and providing convenient services that link housing with hospitals, jobs, schools, and other major destinations [31, 37]. Good accessibility lets people participate more effectively in a variety of activities, such as school, work, hospital visits, and social interactions [31, 38, 39]. Conversely, poor accessibility is a major barrier to improved livelihoods and sustainable development [1, 9].

As stated earlier, accessibility research is rich and varied. However, the variety and complexity of accessibility measures pose a huge challenge to researchers. They also make the measures harder for policy makers to understand, which hinders their application in different cities [31, 40]. How can measures of accessibility be simplified and made easier for policy makers to accept? Encouragingly, with the development of social networking, mobile internet, electronic commerce, and so on, the volume of available data is growing exponentially. Take Baidu Maps as an example: in 2015 it responded to 23 billion positioning requests every day; by 2017 this had increased to 80 billion positioning requests, covering 600 million devices. This vast amount of data enables new methods of measuring accessibility that are better visualized, easier to understand, and more efficient. Considering this, and in order to offer city policy makers and practitioners an intuitive and simple analysis that differs from traditional accessibility measures, this paper explores a new analysis of a city’s accessibility from the angles of spatial linkage strength, job-housing ratio, and the distributions of large-scale hospitals and schools, based on four kinds of data: navigation data, location based service (LBS) data, Point of Interest (POI) data, and census data. The human and business geography of Qingdao city is revisited from the perspective of data, which provides a reference for the evaluation of other cities. This paper is organized as follows. Section 2 presents the datasets used in our study and how they were acquired. Section 3 gives the data mining methods used in the Baidu Huiyan platform. Section 4 analyzes the characteristics of the urban spatial connection of Qingdao, including inter-city linkage strength, internal-external linkage strength, the spatial relation between job and housing, and the distribution of service facilities. Concluding remarks and future research are presented in Section 5.

2. Multisource Data and Acquisition Method

Four kinds of data are used in this paper, i.e., the navigation data, location based service (LBS) data, Point of Interest (POI) data, and the census data. The navigation data and LBS data are mainly from Baidu Huiyan platform. The POI data are from Amap and the census data are from Qingdao Statistical Bureau.

2.1. Navigation Data

In this paper, navigation data are used to analyze the inter-city linkage strength, internal–external linkage strength and spatial relation between job and housing of Qingdao urban areas.

Electronic map Application Programming Interfaces (APIs) (e.g., Google Maps, Apple Maps, Baidu Maps, AutoNavi Maps, and so on) can provide trip route and time information. Here the Web API v2.0 service supplied by Baidu Maps is used to obtain the navigation data. Tables 1 and 2 show data samples for bus and car, respectively [41].

As shown in Tables 1 and 2, for bus travelers the navigation data include the route length, travel time, initial walk distance, initial walk time, travel distance by bus, travel time by bus, arrival walk distance, arrival walk time, and so on [41]. For car travelers, they include the OD number, departure time, origin coordinate, passing intersections, destination coordinate, arrival time, and time cost.

To check the validity of the Baidu Maps navigation data, Gao et al. conducted a taxi-following survey [41]. They found that the average matching rate over all OD pairs is 90.74%, which reflects the high accuracy of the Baidu electronic map API.

2.2. Location Based Service (LBS) Data

In this paper, LBS data are used to analyze the inter-city linkage strength, the internal–external linkage strength and the spatial relation between job and housing of Qingdao city.

Baidu Map has a strong location capability. In 2017, Baidu Map received more than 80 billion location requests per day. Its location data come from 115,000 developers and 650,000 apps and websites. With model training and machine learning, the accuracy of the location data exceeds 85% (http://www.fromgeek.com/liuyong/96032.html). Figure 1 gives the roadmap of population identification. According to Figure 1, a workplace can be determined by the following conditions: (1) The person appears more than 100 times in the same place within two months. (2) The occurrences are concentrated between 8:00 and 19:00. (3) The connected Wi-Fi (Wireless Fidelity) is fixed.

In general, workers have fixed workplaces. The first condition identifies a person’s fixed activity place. Meanwhile, for most jobs, the working time is between 8:00 and 19:00. Therefore, conditions (1) and (2) basically determine a person’s workplace, and the third condition is used to confirm it further.

Similarly, the place of residence can be determined if the location meets the following three conditions: (1) The person appears more than 100 times in the same place within two months. (2) The occurrences are concentrated between 20:00 and 7:00 on the next day. (3) Most weekends are spent at this place.
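The two sets of conditions above can be sketched as a small rule-based classifier. This is only an illustration under stated assumptions, not the procedure used inside the Baidu Huiyan platform: the `Ping` record and its field names are hypothetical, “concentrated” and “most weekends” are both interpreted as a share above 50%, and “fixed Wi-Fi” is interpreted as a single Wi-Fi identifier across all records at that place.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Ping:
    """One location record; the field names are hypothetical."""
    user_id: str
    place_id: str        # e.g. a grid cell or building identifier
    timestamp: datetime
    wifi_id: str         # identifier of the connected Wi-Fi, "" if none


def is_work_hour(ts):
    # Workplace condition (2): between 8:00 and 19:00.
    return 8 <= ts.hour < 19


def is_home_hour(ts):
    # Residence condition (2): between 20:00 and 7:00 on the next day.
    return ts.hour >= 20 or ts.hour < 7


def classify_places(pings, min_visits=100, majority=0.5):
    """Return {(user_id, place_id): "work" | "home"} for one two-month window."""
    groups = defaultdict(list)
    weekend_days = defaultdict(set)   # weekend days observed per user
    for p in pings:
        groups[(p.user_id, p.place_id)].append(p)
        if p.timestamp.weekday() >= 5:
            weekend_days[p.user_id].add(p.timestamp.date())

    labels = {}
    for (user, place), recs in groups.items():
        if len(recs) <= min_visits:   # condition (1): more than 100 appearances
            continue
        n = len(recs)
        work_share = sum(is_work_hour(r.timestamp) for r in recs) / n
        home_share = sum(is_home_hour(r.timestamp) for r in recs) / n
        wifi_ids = {r.wifi_id for r in recs if r.wifi_id}
        here = {r.timestamp.date() for r in recs if r.timestamp.weekday() >= 5}
        observed = weekend_days[user]
        weekend_share = len(here) / len(observed) if observed else 0.0

        if work_share > majority and len(wifi_ids) == 1:
            labels[(user, place)] = "work"      # workplace conditions (2), (3)
        elif home_share > majority and weekend_share > majority:
            labels[(user, place)] = "home"      # residence conditions (2), (3)
    return labels
```

The thresholds are exposed as parameters because the text gives only the count of 100 appearances; the 50% shares are placeholders that would need calibration against ground truth.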

The LBS data can be purchased through this contact page: https://huiyan.baidu.com/contact.html.

2.3. Point of Interest (POI) Data

Here POI data are used to analyze the spatial relation between job and housing and the distributions of large-scale hospitals and schools.

Point of Interest (POI) data include three kinds of information: name, classification, and coordinates. In China, Baidu Map, Amap, Map World, and others supply open POI interfaces. In this paper, we use the POI data of Amap, which are classified into 24 categories and 262 sub-categories. Users can apply for a unique key by logging in to the Amap web API service at https://lbs.amap.com/api/webservice/guide/api/search. The POI data can then be acquired with this key. Table 3 gives the format of Amap’s POI data.

As shown in Table 3, each POI record includes the name, categories, longitude, and latitude.

2.4. The Census Data

In this paper, the census data are used to analyze the spatial relation between job and housing and the distributions of large-scale hospitals and schools.

In 2010, the Chinese Government conducted the sixth national census. It covers gender, age, ethnicity, educational level, industry, profession, population mobility, social security, marriage and childbearing, death, housing, and so on. The census data used in this paper are from Qingdao Statistical Bureau and can be obtained from the following website: http://qdtj.Qingdao.Gov.Cn/n28356045/n32561056/n32561073/n32561270/index_2.Html.

3. Data Mining Methods

Several data mining and clustering algorithms are used to identify the permanent resident population, distinguish urban function areas, and distinguish commute modes. They are briefly introduced as follows.

3.1. XGBoost Algorithm

XGBoost is a scalable tree boosting system [42]. It is an improved algorithm based on the gradient boosting decision tree that constructs boosted trees efficiently and can operate in parallel. The boosted trees in XGBoost are divided into regression and classification trees. The core of the algorithm is to optimize the value of the objective function. Rather than computing similarities between feature vectors, gradient boosting constructs boosted trees to obtain feature scores, which indicate the importance of each feature to the trained model. The more a feature is used to make key decisions in the boosted trees, the higher its score becomes.

Here XGBoost is used to identify the permanent resident population in Qingdao city.
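For reference, the objective function mentioned above is usually written, following the XGBoost paper [42], as a training loss plus a complexity penalty on each tree. The notation below is that of [42] rather than of this article: l is a differentiable loss, f_k is the k-th tree, T is its number of leaves, w is its vector of leaf weights, and γ and λ are regularization parameters.

```latex
\mathcal{L}(\phi) = \sum_{i} l\left(\hat{y}_i, y_i\right) + \sum_{k} \Omega\left(f_k\right),
\qquad
\Omega(f) = \gamma T + \frac{1}{2}\lambda \lVert w \rVert^{2}
```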

3.2. k-means Algorithm

k-means is the most popular and the simplest partitional algorithm. It has a rich and diverse history, as it was independently discovered in different scientific fields by Steinhaus [43], Lloyd (proposed in 1957, published in 1982) [44], Ball and Hall [45], and MacQueen [46]. Even though k-means was first proposed over 50 years ago, it is still one of the most widely used algorithms for clustering. Ease of implementation, simplicity, efficiency, and empirical success are the main reasons for its popularity [47, 48]. In this paper, k-means is used to distinguish the urban function area.
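As an illustration of how simple the algorithm is, a minimal sketch of Lloyd’s iteration follows. It is not the implementation used in the Baidu Huiyan platform; the input is assumed to be one numeric feature vector per spatial unit (for example, hypothetical shares of POI categories per grid cell), and the function name and parameters are chosen here for illustration.

```python
import math
import random


def kmeans(points, k, iters=100, seed=0):
    """Lloyd's algorithm on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)        # initialize from k random points
    assign = None
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        new_assign = [
            min(range(k), key=lambda j: math.dist(p, centers[j]))
            for p in points
        ]
        if new_assign == assign:           # converged: no label changed
            break
        assign = new_assign
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, a in zip(points, assign) if a == j]
            if members:                    # keep the old center if empty
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centers, assign
```

The result depends on the initial centers, so in practice the procedure is usually repeated with several seeds and the partition with the smallest within-cluster distance is kept.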

3.3. Random Forests Algorithm

Breiman [49] proposed random forests, which add an additional layer of randomness to bagging. In addition to constructing each tree using a different bootstrap sample of the data, random forests change how the classification or regression trees are constructed. In standard trees, each node is split using the best split among all variables. In a random forest, each node is split using the best among a subset of predictors randomly chosen at that node. This somewhat counterintuitive strategy turns out to perform very well compared to many other classifiers, including discriminant analysis, support vector machines, and neural networks, and is robust against overfitting [49]. In addition, it is very user-friendly in the sense that it has only two parameters (the number of variables in the random subset at each node and the number of trees in the forest), and it is usually not very sensitive to their values. Here the random forests algorithm is used to distinguish the commute mode.
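The two sources of randomness described above, bootstrap samples and random feature subsets, can be shown with a deliberately reduced sketch. It is a simplification and not Breiman’s full algorithm: each “tree” is a single-split stump, so the feature subset is drawn once per tree rather than at every node, and splits are scored by misclassification count instead of Gini impurity. The two parameters named in the text appear as `mtry` and `n_trees`; all function names are hypothetical.

```python
import random
from collections import Counter


def fit_stump(rows, labels, features):
    """Best single-threshold split over the given feature indices."""
    best = None
    for f in features:
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            l_lab = Counter(left).most_common(1)[0][0]
            r_lab = Counter(right).most_common(1)[0][0]
            errors = sum(y != l_lab for y in left) + sum(y != r_lab for y in right)
            if best is None or errors < best[0]:
                best = (errors, f, t, l_lab, r_lab)
    if best is None:                       # no valid split: majority label
        maj = Counter(labels).most_common(1)[0][0]
        return (features[0], float("inf"), maj, maj)
    return best[1:]


def fit_forest(rows, labels, n_trees=25, mtry=1, seed=0):
    """Train n_trees stumps, each on a bootstrap sample and mtry features."""
    rng = random.Random(seed)
    n, n_features = len(rows), len(rows[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]      # bootstrap sample
        feats = rng.sample(range(n_features), mtry)     # random feature subset
        forest.append(
            fit_stump([rows[i] for i in idx], [labels[i] for i in idx], feats)
        )
    return forest


def predict(forest, row):
    """Majority vote over all stumps."""
    votes = Counter(l if row[f] <= t else r for f, t, l, r in forest)
    return votes.most_common(1)[0][0]
```

For commute-mode recognition, a row might hold hypothetical trip features such as distance and average speed, with the label being the mode; a production model would use full-depth trees from an established library rather than stumps.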

4. Analysis of the Characteristics of Urban Spatial Connection

4.1. Analysis of Inter-City Linkage Strength of Qingdao

As shown in Figure 2, the Qingdao urban area includes the east area, the west area, and the north area.

In recent decades, Qingdao has urbanized rapidly. The urbanization rate increased from 32.62% in 1990 to 71.53% in 2016 [50], which created a sharp conflict between the city’s need for rapid development and the insufficient supply of urban land. The traditional mode of city expansion can therefore hardly support the efficient operation of Qingdao city. Strengthening the connectivity between urban regions is the key to solving the problem of discrete urban space. Here the linkage strength between different areas is represented by the flow of people between them. From one day of LBS data, we can record each person’s original location, activity location, and final location, and from these the flow between different areas can be obtained. The average daily flow between different areas is given in Figure 3.
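The flow counting described above can be sketched as follows. The record layout is an assumption: each trip is taken to be a `(day, origin_area, destination_area)` tuple already derived from the LBS locations, and flows are treated as undirected, which the text does not state explicitly.

```python
from collections import Counter


def daily_flows(trips):
    """Average daily person flow between each pair of different areas.

    `trips` is an iterable of (day, origin_area, destination_area) tuples.
    Returns {frozenset({a, b}): average number of trips per observed day}.
    """
    totals = Counter()
    days = set()
    for day, origin, dest in trips:
        days.add(day)
        if origin != dest:                 # trips inside one area are skipped
            totals[frozenset((origin, dest))] += 1
    n_days = len(days) or 1
    return {pair: count / n_days for pair, count in totals.items()}
```

Using a `frozenset` as the key merges the two directions of travel between a pair of areas into one linkage value.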

As shown in Figure 3, the average daily flow between the east area and the north area is 5 times that between the east and west areas, and the average daily flow between the east and west areas is 5 times that between the west and north areas. This is related to urban location, the total population of the central areas, the convenience of transportation, basic service facilities, and so on. As shown in Figures 4 and 5, the east area and the north area are adjacent on land; their linkage strength is high and accessibility between them is therefore good. The west area, however, is separated from them by Jiaozhou Bay; its linkage strength with them is weak and accessibility is therefore poor.

4.2. Analysis of Internal–External Linkage Strength of Qingdao Urban Areas

Figures 5–7 and Table 4 show the internal-external linkage strengths of the north area, east area, and west area. Early in its development, Qingdao city mainly expanded along the coastal plain area [51]. Due to the convenient transportation conditions and favorable geographical environment, the north area developed rapidly. As shown in Figure 8, the airport, Jiaozhou Bay expressway, and the Jiqing expressway further strengthened the north area’s relation with the surrounding districts, such as Licang District, Jimo District, and Shibei District. As shown in Table 4, the proportions of flow between the north area and its three peripheral areas (Licang District, Jimo District, and Shibei District) are 34%, 24%, and 17%, respectively.
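Proportions of the kind reported in Table 4 can be computed from pairwise flow volumes with a few lines of code. The sketch below assumes a hypothetical input format, a dictionary mapping an unordered pair of area names to a flow volume, and the numbers in the usage example are made up for illustration.

```python
def linkage_shares(flows, area):
    """Share of `area`'s external flow that goes to each partner area.

    `flows` maps (area_a, area_b) tuples to a flow volume; the order of
    the two names in a pair does not matter.
    """
    partners = {}
    for (a, b), volume in flows.items():
        if a == b or area not in (a, b):
            continue                        # skip internal or unrelated flows
        other = b if a == area else a
        partners[other] = partners.get(other, 0) + volume
    total = sum(partners.values())
    return {name: v / total for name, v in partners.items()} if total else {}


# Made-up volumes, only to show the call.
example = {("north", "A"): 30, ("B", "north"): 10, ("north", "C"): 10, ("east", "west"): 50}
print(linkage_shares(example, "north"))
```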

The opening of the Jiaozhou Bay Bridge and the Cross-Harbour Tunnel strengthened the connection between the east area and the west area (Figure 9). Compared with the household travel survey data of 2010, the traffic linkage strength between the east area and the west area increased by 5% in 2011, which confirms that the construction of transportation infrastructure can promote accessibility [52]. The linkage strength between the west area and the other peripheral areas (such as Pingdu District and Laixi District) is weaker than that of the east area. This is because the east area is the traditional urban area and administrative center. Business and residential areas are clustered there, and it therefore has strong internal-external linkage strength with other areas.

4.3. Analysis of Spatial Relation between Job and Housing of Qingdao

“Job” and “housing” are two key variables in the evolution of urban spatial structure, which determine the distribution characteristics of commuters and traffic demand of urban residents.

Based on Baidu Huiyan map data and the sixth census data of Qingdao, the resident population and working population in different administrative districts are analyzed. Figure 10 shows the job-housing ratio of different areas of Qingdao.

As shown in Figure 10, the job-housing ratio varies greatly between areas. It declines gradually from the central urban areas to the surrounding areas. In the east area, the resident population is roughly the same as the number of workers, and the ratio of employment to resident population is as high as 0.8. Most people work nearby, which effectively avoids long-distance commuting. Statistics also indicate that internal travel accounts for 87% in the east area, much higher than the 59% of the north area and the 78% of the west area. Jobs and housing are concentrated in three districts (Shinan District, Shibei District, and Laoshan District), whose average job-housing ratios are higher than those of the peripheral areas (the north area, the west area, and other areas such as Jiaozhou District, Jimo District, Pingdu District, and Laixi District). In particular, Shinan District has the highest job-housing ratio (1.2). Residents and jobs are mainly distributed along Hong Kong road, Yan’an 3 road, Shandong road, Nanjing road, and Fuzhou road. Blocked by mountains, Laoshan District is relatively independent; its jobs are concentrated on both sides of Haier road, and its job-housing ratio is close to 1.
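The job-housing ratio used here is read as the working population divided by the resident population of the same district. A minimal sketch under that assumption follows; the dictionary inputs and district names in the usage example are hypothetical, and the numbers are made up.

```python
def job_housing_ratio(jobs, residents):
    """Working population divided by resident population, per district.

    Districts with no recorded residents are left out instead of
    producing a division by zero.
    """
    return {
        district: jobs[district] / residents[district]
        for district in jobs
        if residents.get(district)
    }


# Made-up counts, only to show the call.
ratios = job_housing_ratio({"A": 120, "B": 40}, {"A": 100, "B": 80})
print(ratios)
```

A ratio near 1 indicates that jobs and residents are roughly balanced, while values well below 1 mark districts whose residents mostly work elsewhere.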

Figures 11 and 12 give heat maps of the urban job and housing distributions. Most jobs and housing are located in the east area. There, the resident population is mainly concentrated in the north of Shibei District and in Licang District, which account for more than 65% of the population of the east area but less than 55% of its jobs. This forms a spatial pattern of working in the south (Shinan District) and living in the north (the north of Shibei District and Licang District), which results in tidal traffic in the morning and evening peaks. Shandong road and Metro Line 3 illustrate how serious the tidal traffic phenomenon in the east area is. Figure 13 gives two photos taken in the morning peak and the evening peak of a working day. As shown in Figure 13(a), the main traffic flow runs from north to south in the morning peak. In the evening peak, it runs from south to north (Figure 13(b)). Figures 15 and 16 show the average distribution of Metro Line 3 passengers in the morning and evening peaks on working days. In the morning peak, the number of passengers traveling from Qingdao North Railway Station to Qingdao Railway Station is larger than that in the reverse direction (Figure 15), and in the evening peak the opposite holds (Figure 16). Figure 14 shows the locations of Shandong road and Metro Line 3 on the map.

4.4. Distribution Analysis on Service Facilities in Qingdao

Figures 17 and 18 show the distributions of urban public service facilities and commercial service facilities, respectively. As these figures show, public service facilities are mainly concentrated in the south of the east area.

According to Zhao et al. [52, 53], the spatial service efficiency of Qingdao is only 1/6 of that of a city on a plain because of the terrain constraints of Laoshan Mountain and Jiaozhou Bay. Statistics show that the three built-up districts in the east area (Shinan District, Shibei District, and Licang District) occupy only 31.4% of the urban land area. However, they contain 54.3% of public service facilities (science and education, sports and leisure, transportation facilities, scenic spots, etc.) and 58.5% of commercial service facilities (catering facilities, shopping malls, hotels, etc.). More specifically, Shinan District takes up only 5.4% of the urban land area but has 15.5% of public service facilities and 18.4% of commercial service facilities.

In contrast, Laoshan District and Huangdao District developed slowly in their early stages because they lacked supporting service facilities. Taking Huangdao District (excluding Jiaonan) as an example, its built-up land area accounts for 27% and its population accounts for 19%. However, its commercial service facilities account for less than 12%.

The distribution of hospitals and their population coverage are shown in Figures 19 and 20, respectively. Figure 19 shows that medical facilities are clearly unevenly distributed in the central urban area, and large-scale medical facilities are mainly concentrated in the inner districts, such as Shinan District, Shibei District, and Licang District. Statistics show that there are 47 large public hospitals in the central area, but 80% of them are concentrated in Shinan District, Shibei District, and Licang District. Laoshan District, Chengyang District, and Huangdao District (excluding Jiaonan) lack large medical institutions.

As shown in Figure 20, there is an imbalance between the supply of and demand for medical facilities, and the demand for community-level hospitals is difficult to meet. Large public hospitals cover 24% of the permanent population (1.45 million persons) within a distance of 1 kilometer. Within 1–2 kilometers, they cover nearly 1.9 million permanent residents, who account for 30% of the population. However, 1.05 million permanent residents (17%) still need to travel more than 5 kilometers to reach a large hospital for medical treatment.
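Coverage figures by distance band, such as those above, can be obtained by assigning each population cell to its nearest facility. The sketch below is one possible way to do this and not necessarily the authors’ procedure: it assumes population given as hypothetical `(lon, lat, population)` cells, facilities as `(lon, lat)` points taken from POI data, straight-line (great-circle) distance rather than network distance, and band edges of 1, 2, and 5 km.

```python
import math


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in kilometres between two lon/lat points."""
    r = 6371.0                              # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def coverage_by_band(cells, facilities, edges=(1.0, 2.0, 5.0)):
    """Share of population whose nearest facility lies in each distance band.

    cells: list of (lon, lat, population); facilities: list of (lon, lat).
    Returns shares for [0, e1), [e1, e2), ..., [e_last, infinity).
    """
    totals = [0.0] * (len(edges) + 1)
    for lon, lat, pop in cells:
        d = min(haversine_km(lon, lat, flon, flat) for flon, flat in facilities)
        band = sum(d >= e for e in edges)   # index of the band containing d
        totals[band] += pop
    total = sum(totals)
    return [t / total for t in totals] if total else totals
```

The same function applies to schools by passing school locations and edges such as `(0.5, 1.0, 3.0)`.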

Figures 21 and 22 give the distribution and population coverage of primary and secondary schools. The layout of primary and secondary schools basically matches their service population. There are 367 primary and secondary schools in the urban area. Shinan District, Shibei District, and Licang District account for about 50% of primary and secondary school resources, which serve 47% of the resident population.

According to the “Code for design of school” in China, the service radius of a complete primary school is 500 meters and the service radius of a junior high school is 1000 meters (http://www.fromgeek.com/liuyong/96032.html). However, as shown in Figure 15, within 1 km, primary and secondary schools cover only 75% of students. Therefore, there are still large service blind areas in Qingdao city.

At the same time, the resident population covered by primary and secondary schools shows a significant downward trend as distance grows, which is basically consistent with the concept of “going to school nearby, living close to school” in Qingdao (Figure 23). Most permanent residents (2.75 million) live within 500 meters of the central area, accounting for 45% of the total population. Within 500–1000 meters, the resident population reaches 1.84 million, accounting for 30% of the total. Beyond 3,000 meters, however, there are only 75,000 people, or 1.2% of the total population.

5. Conclusion and Future Research

This paper discusses urban sustainable development characteristics of Qingdao based on the navigation data, location based service (LBS) data, Point of Interest (POI) data and the census data. Spatial linkage strength, job-housing ratio and distributions of large-scale hospitals and schools are used to analyze the city’s development characteristics.

The inter-city linkage strength and internal-external linkage strength are used to evaluate the spatial linkages of the urban agglomeration. In the inter-city linkage strength analysis, results show that the average daily flow between the east area and the north area is 5 times that between the east and west areas. The east area and the north area are adjacent on land and their traffic connection is convenient, whereas the west area is separated from them by Jiaozhou Bay and faces them across the sea. In short, urban spatial structure determines the city’s accessibility, and spatial connectivity and public facilities have a very strong positive correlation with inter-city accessibility.

In the internal-external linkage strength analysis, we find that the Jiaozhou Bay expressway and the Jiqing expressway strengthen the north area’s relation with the surrounding districts. Meanwhile, with the construction of the sea-crossing bridge and the submarine tunnel, the connection between the east and west areas is becoming ever closer, which further confirms that providing transport infrastructure and services could greatly strengthen the accessibility between two areas.

Job-housing ratio is used to analyze the distribution of residents and jobs. Results show that the job-housing ratio declines gradually from the central urban areas to the surrounding areas. Most jobs and housing are located in the east area. This spatial distribution results in tidal traffic phenomenon in the morning and evening peak.

The distributions of large-scale hospitals and schools are used to estimate the service capacity of public facilities. Results show that there is an imbalance between the supply of and demand for medical facilities, and that residents’ demand for community-level hospitals is difficult to meet.

The layout of primary and secondary schools basically matches their service population. However, within a distance of 1 km, primary and secondary schools cover only 75% of students, which leaves large service blind areas. In short, public service facilities are mainly concentrated in developed areas, whereas other areas developed slowly in their early stages because they lacked supporting service facilities. According to our analysis, in order to accelerate the development of undeveloped areas, the government should increase the number of community-level hospitals to balance medical supply and demand, and should extend the service scope of primary and secondary schools.

Only four kinds of data are used here. In the future, more data sources, such as travel survey data and traffic detection data, can be taken into account in accessibility research. Likewise, only spatial linkage strength, job-housing ratio, and the distributions of large-scale hospitals and schools are used here to measure the city’s accessibility. In future research, other measures should also be considered, such as land use, traffic network, and traffic mode.

Data Availability

The data used to support the findings of this study are included within the article.

Conflicts of Interest

The authors declare no conflicts of interest.

Authors’ Contributions

Conceptualization—Z. Wang and G. Gao; methodology—G. Gao and T. F. Li; software—Z. Wang; validation—G. Gao, Z. Wang, and X. M. Liu; formal analysis—G. Gao; investigation—X. M. Liu; resources—Z. Wang; data curation—Z. Wang; writing-original draft preparation—Z. Wang; writing-review and editing—G. Gao; visualization—Z. Wang; supervision—X. M. Liu.

Funding

This research was funded by the National Natural Science Foundation of China (71801144, 71371111), China Postdoctoral Science Foundation Funded Project (2019M652437), the Scientific Research Foundation of Shandong University of Science and Technology for Recruited Talents (2019RCJJ014), Shandong Postdoctoral innovation project (201903030), and Shandong Key Research and Development Program under Grant 2018GHY115022.

Acknowledgments

We would like to thank Qingdao Statistical Bureau for the sixth census data.