Abstract

During urban development, underground pipeline networks are laid on a large scale, and new and old networks operate side by side. Underground concrete drainage pipes have become a focus of operation and maintenance because they are well concealed and prone to serious corrosion. Current manual inspection of subterranean concrete drainage pipelines involves heavy workloads and high risks, which makes it difficult to meet the diagnostic needs of intricate urban pipeline networks. A consensus has emerged that advanced information technology should be used to intelligently perceive, accurately identify, and precisely predict the condition of urban subterranean drainage networks. This study focuses on the development of detection and evaluation methods for underground concrete drainage pipe networks. It discusses common algorithms for classifying, locating, and quantifying pipeline defects by combining the principles of deep learning with typical application examples. The intelligent progression of information collection methods, image processing techniques, damage prediction models, and pipeline diagnostic systems is systematically reviewed. Lastly, prospects for future research on intelligent pipeline diagnosis are provided.

1. Introduction

The sustainable development of urban areas is a strategic priority centered on people and the dual carbon goals. As urbanization accelerates, the total length of the pipelines laid to support city development continues to increase. According to data from the National Bureau of Statistics of China, as of the end of 2024, the total length of urban drainage pipelines in the country reached 91.35 million kilometers, an increase of 47.45 million kilometers from 2012, and it continues to grow at an annual rate of 7.50%. As China’s urban infrastructure shifts from large-scale incremental construction to quality improvement of existing assets and balanced development of incremental structures, the drainage pipe network, as the “vascular” system of the city, plays an important role in ensuring public health and environmental quality in urban areas through safe and efficient operation [1]. Ensuring the safe operation of urban drainage systems, improving their operational quality, enhancing service efficiency, and enabling them to adapt better to urban development are significant priorities for the construction of smart cities and are essential for the sustainable development of urban areas. However, the pipelines that are crucial to China’s urban drainage systems were mostly constructed in the 1960s. Owing to the limited understanding and technology available at the time, these pipelines have aged and become severely clogged. The early design and planning of the urban drainage pipeline network lacked systematic and forward-looking approaches, resulting in varying degrees of damage to most pipelines. This damage is particularly prominent in pipelines that have been in service for over 20 years, severely impacting the normal operation of the pipeline network and restricting the full functionality of the city.

The overall operating efficiency of urban drainage systems worldwide is low, and structural and functional defects are common. This results in inadequate drainage safety redundancy and significant hidden flood control and drainage hazards [2]. Because urban drainage systems are complex, concealed, and difficult to operate and maintain, problems such as leakage, pipe bursts, siltation, aging, and fracture often arise, posing a potential threat to the health of residents. In the United States, there are 23,000 to 75,000 cases of sewage overflowing from sewage pipelines every year [3, 4]. The discharge of untreated wastewater directly contaminates surface and groundwater sources and indirectly contributes to the spread of human diseases. The repeated excavation and repair of drainage pipelines not only consume considerable manpower and material resources but also seriously disrupt travel [5]. On February 22, 2019, a burst in the underground sewage pipeline in Hancheng New Town, Shaanxi, led to the collapse of the road surface and obstructed the passage of motor vehicles. The overall operational efficiency of urban drainage systems in major cities is generally low, which not only triggers localized collapses and other safety accidents but also leads to urban waterlogging, posing a threat to people’s lives and property and even causing huge economic losses to society, especially during extreme rainfall events. In July 2021, heavy rain in central and western Europe led to large-scale floods, resulting in over a hundred deaths in Germany alone. Meanwhile, Zhengzhou also experienced extremely heavy rainfall, causing severe waterlogging and resulting in 380 deaths or disappearances, with direct economic losses reaching 40.9 billion yuan. In August 2023, heavy rainfall in North China created a severe flood control situation in multiple cities, with floodwater in some affected areas taking up to about a month to recede. Such events place demands on the “resilience” of urban underground drainage networks to adapt to environmental changes and respond to natural disasters. The New Urbanization Implementation Plan for the 14th Five-Year Plan period of China explicitly emphasizes the need to intensify efforts to control urban waterlogging and eradicate the phenomenon of “cities turning into seas.”

The aforementioned challenges are associated not only with the absence of proactive planning and design but also with the limited diagnostic capabilities available for urban drainage networks. Because of the long temporal gap between the construction of old and new pipeline networks and the intricate operational environment, rapid and precise assessment, early warning, intelligent control, and intelligent operation and maintenance are urgently needed. Scientific detection and accurate diagnosis of pipeline defects serve as the foundation for intelligent operation and maintenance [6]. Currently, underground drainage pipeline defect diagnosis in China relies primarily on manual detection and identification, predominantly using closed-circuit television (CCTV) inspection [7]. Combined with calculation and analysis, the safety performance of underground drainage pipelines is then scored and assessed against the relevant regulations and standards (Figure 1(a)). However, these methods are time-consuming, labor-intensive, and ineffective. Extensive excavation causes significant disruption to daily activities, and the results are susceptible to the subjective experience of inspection personnel. This leads to a high rate of false and missed detections, inconsistent evaluation outcomes, and inaccurate defect-level assessments. Therefore, there is an urgent need for a high-efficiency, cost-effective, automated, and intelligent method for diagnosing underground drainage pipelines. Such a method would enhance the efficiency and accuracy of pipeline diagnosis and maintenance while conserving human, material, and financial resources.

Since the inception of the concept of artificial intelligence in the 1950s, machine vision technology based on deep learning has advanced significantly, enabling the practical implementation of “smart” cities [8]. Following widespread interest in artificial neural networks for small-sample data mining and model prediction [9], intelligent detection technology using convolutional neural networks and multisensor fusion has significantly enhanced the capability to acquire defect information in drainage pipelines [10]. Furthermore, it has the potential to conserve human and material resources while improving detection efficiency and accuracy (Figure 1(b)). While pipeline diagnosis technology has made progress in positioning and identification, research on using intelligent methods to assess existing pipeline damage and offer repair recommendations is still in its early stages. This study examines the research progress of intelligent detection technology for underground drainage pipelines, focusing on the intelligent perception, classification, identification, and evaluation of pipeline defects and the intelligent diagnosis of pipeline service performance. It also analyzes the existing research issues and outlines prospective research directions and the key technologies in which breakthroughs are needed.

2. Underground Concrete Drainage Pipeline Defects and Traditional Diagnostic Methods

2.1. Classification and Cause Analysis of Pipeline Defects

Urban drainage pipeline materials mainly include concrete, plastic, metal, and other composite materials [11]. Compared with other materials, concrete offers superior compressive strength, well-established production processes, convenient sourcing, and a longer service life; hence, it is the predominant pipe material [12]. In major cities globally, a substantial number of concrete pipelines have been in use for multiple decades and, in some cases, for over a century [13]. As a pipeline ages, its performance and efficiency decline rapidly. Structural defects may arise when the design bearing capacity of the pipeline is lower than the pressure exerted by the surrounding fill, when the thickness and bearing strength of the pipe wall do not meet the applicable standard, when the pipeline coating is damaged, when the pipeline has been in service for too long, when uneven loading causes settlement, deformation, or rupture, or when surrounding geological conditions and other uncontrollable factors intervene. Functional defects may arise when ground sediment and industrial and domestic waste collect in the drainage pipeline and settle at its bottom, impairing drainage. The common defects of the pipeline are shown in Figure 2.

Beyond external factors such as topography, climate, and soil characteristics in the city and region where the pipeline is situated, the design and construction technology of the pipeline itself also influence its service life [14]. Furthermore, neglecting the design of durability redundancy and response to specific environmental conditions may heighten the risk of pipeline damage, hasten corrosion, and shorten its service life. In areas where microbial corrosion is prevalent, the application of specialized cement mortars such as calcium aluminate cement mortar and cement-geopolymer cementitious materials can notably enhance pipeline resistance to chemical and biological sulfuric acid. Additionally, the utilization of antibacterial nanomaterials or the inclusion of cement mortar lining can effectively bolster the concrete’s corrosion resistance [15]. The long-term performance of pipelines has become significantly more intricate due to the changing pipeline materials and their associated influencing factors. To minimize the financial loss of inadequate pipeline management and maintenance, accurate and efficient diagnosis of pipelines has become an important part of the operation and maintenance of urban pipeline networks.

2.2. Traditional Pipeline Detection and Diagnosis Methods

Early detection methods for underground concrete pipelines relied heavily on manual operations, including diver inspection, mirror inspection, and the mud-bucket method. These methods not only offered limited detection accuracy but also involved poor working conditions and could even threaten the personal safety of operators. With the development of detection technology, closed-circuit television (CCTV) [7], pipeline sonar detection technology [16], pipe quick view inspection (QV) [17], ground penetrating radar detection [18], multisensor detection [19], sewer scanner and evaluation technology (SSET) [20], infrared temperature recording and analysis [21], and other technologies have gradually been applied in practical engineering.

The CCTV pipeline detection system emerged in the 1950s and reached maturity in the 1980s. The system primarily consists of a crawler, a controllable camera, a cable, and a ground console. Operators use the main controller to maneuver the crawler within the pipeline and capture video and images, which are transmitted via the cable to the main controller display. The acquired video or image information is then analyzed and evaluated with the assistance of professional technicians. This detection technology requires substantial manpower and time. The results can be influenced by subjective factors, leading to potential issues such as false and missed detections. Furthermore, it imposes specific requirements on pipeline water level and sludge depth. In field applications, CCTV is therefore often combined with other techniques according to the scenario. For instance, portable QV may be used for small-scale inspections following pipeline dredging, while pipeline endoscopic sonar detection may complement CCTV inspections. It is important to note that sonar detection is primarily employed to identify functional defects below the liquid level and has limited ability to detect the full range of structural flaws.

In recent years, more sophisticated detection devices have been developed compared to the aforementioned equipment. For instance, the multisensor detection system comprising a microwave sensor system, optical triangulation system, and acoustic system can accurately detect the geometric dimensions of drainage pipelines, assess the surrounding soil condition, and identify various pipeline defects, such as water leakage, scaling, and corrosion. This integrated detection system amalgamates multiple sensors, incorporating the strengths of various detection techniques, and is suited for a wide array of application scenarios. Nonetheless, the complexity of the system’s composition, its low stability, and the high cost of commercial sensors present challenges in terms of widespread adoption.

As materials, structures, and equipment continue to be optimized and improved, the variety of pipeline robots has expanded, and pipeline inspection has emerged as a prominent topic in robotics (Figures 3(a) and 3(b)). Comprehensive reviews of the progress and breakthroughs of pipeline inspection robots on a range of key technical issues are available in the literature [22, 24]. European countries began research on pipeline robots relatively early. In 1978, J. Vertut of France began exploring pipeline robots and developed the leg-wheel pipeline walking mechanism model IPRIV [25]. The pipeline robot MAKRO [26], developed by B. Klaassen and others in Germany, has clearly defined module functions and strong obstacle-crossing ability, but it is heavy and long, bends poorly, and is more likely to get stuck during inspection. The S series pipeline inspection robot developed by China’s Shenzhen Schroeder company has good obstacle-surmounting ability. Its 360° camera can automatically focus on the damaged position in the pipeline, complete automatic detection, and issue a report. However, the robot is expensive and requires specialized personnel for later maintenance, so it is currently used by only a limited number of pipeline inspection and repair companies. The established products in the global market include the ROVVER series robots from the United States and the IPEK pipeline endoscopic detection robots from Germany, both equipped with corresponding operational systems and control terminals. The ROVVER series boasts diverse functionality and robust environmental adaptability; however, compared with the IPEK robot, it is more costly and less multifunctional. Recent studies have introduced more advanced drifting in-pipe robots [23] that can be powered by the fluid or by internal sources to maneuver within the pipeline. This approach makes data acquisition less susceptible to environmental constraints and facilitates collaborative multiuser engagement to enhance detection efficiency (Figure 3(c)). However, while collecting information, these robots also encounter challenges such as redundant data and complex data analysis. Although the improved residual attention-based method for recognizing drainage pipeline defects significantly enhances the robot’s recognition accuracy and efficiency, algorithmic optimization alone cannot compensate for the drawbacks of high procurement costs and short maintenance cycles, which impedes the widespread deployment of drifting robots in practical engineering. It is envisaged that through the enhanced integration of diverse technologies (e.g., geographic information systems), simultaneous localization and mapping, and the implementation of multirobot operations, the data gathered during pipeline inspections will become more comprehensive and of higher quality.

As drainage pipe detection methods have evolved, the relevant theories have gradually matured and pipeline data have become increasingly abundant. Researchers from various countries have attempted to combine engineering testing with laboratory simulation to obtain relevant parameters [27], establish mathematical models that account for multiple pipeline damage factors [28], and formulate regulations for the operation and maintenance of drainage networks, such as the Sewerage Rehabilitation Manual (SRM) in the UK [29], the Pipeline Assessment and Certification Program (PACP) [30] in the US, and China’s Technical Specifications for Inspection and Evaluation of Urban Sewer (CJJ 181-2012) (referred to as the Specifications) [31]. Taking the Specifications as an example, the evaluation method treats the pipeline between two adjacent inspection chambers as a single segment for comprehensive assessment. For structural defect evaluation of a pipeline segment, structural defect parameters, segment damage condition parameters, and repair indices are introduced. Defect types are divided according to intervals of the defect density value, and repair levels are determined, from which repair plans are formulated for damage of different severity. For functional defect evaluation of a pipeline segment, the assessment does not consider soil parameters, since the drainage flow inside the pipeline is independent of the soil characteristics. Otherwise, the principles for determining functional defect parameters, operating condition parameters, and functional defect density are fundamentally similar to those for structural defect evaluation. As an example, the survey and evaluation process for CCTV inspection is illustrated in Figure 4.

Moreover, the Specifications present specific requirements for the pipeline inspection preparation, safe and orderly operation, and management supervision. They also stipulate regulations for on-site operations, management, inspection, maintenance, and repair related to the evaluation work, as well as associated industries such as highways, transportation, and shipping. Nevertheless, the Specifications impose greater demands on the professional expertise of evaluators. Owing to the limited evaluation data for pipelines of diverse materials, the evaluation process for individual pipeline components is more intricate, resulting in certain deficiencies in the Specifications. For instance, the grading of defect evaluation fails to account for the diversity of pipeline materials, and it lacks evaluation methods for manholes and rainwater inlets. Furthermore, subjective provisions persist in the evaluation of pipeline defects and the subsequent maintenance recommendations, without a comprehensive quantification.

3. Intelligent Identification and Analysis of Underground Concrete Drainage Pipeline Defects

3.1. Pipeline Detection Method Based on Traditional Machine Vision

The integration of traditional pipeline detection methods with machine learning, driven by the advancement of bionic theory and artificial intelligence technology, has emerged as a crucial avenue for future development in pipeline detection technology. Moselhi [32] pioneered the use of image processing for automated pipeline defect detection in 1999. Subsequent advancements and refinements in machine vision technology have led to significant progress in the localization, categorization, and identification of drainage pipeline defects.

Because pipeline detection relies on machine vision, the quality of image processing is closely tied to the clarity and reliability of the original data. Traditional cameras are inadequate for drainage pipeline detection, which demands comprehensive information, dynamic tracking, and clear details. According to [33], active panoramic vision technology can effectively address the structure from motion (SFM) problem by tracking feature points and their corresponding three-dimensional coordinates. However, the complexity of pipeline images limits the accuracy of the resulting three-dimensional pipeline model. As various detection systems mature, images or videos that accurately represent the actual condition of pipelines have become a crucial data source for machine learning. Meijer [34] enhanced image quality by using the Panoramo system with stroboscopic lights to capture still images at 5-centimeter intervals instead of recording video, thereby significantly improving data clarity. However, a fully automated pipeline detection method must not only maintain the semantic continuity of video data but also minimize human intervention in the image acquisition process. When data are insufficient [7], data augmentation methods such as flipping, rotating, scaling, mirroring, and color adjustment should be employed to expand the dataset [2]. The discussion of data acquisition methods is integral to the entire process of machine vision development.
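The augmentation operations named above can be sketched in a few lines. The following is a minimal illustration in plain Python, assuming a grayscale image stored as a list of rows of 0 to 255 integers; the function names and the brightness range are illustrative assumptions, not taken from the cited studies.

```python
import random

def flip_horizontal(img):
    """Mirror the image left to right."""
    return [row[::-1] for row in img]

def flip_vertical(img):
    """Flip the image top to bottom."""
    return [row[:] for row in img[::-1]]

def rotate90(img):
    """Rotate the image 90 degrees clockwise."""
    return [list(col) for col in zip(*img[::-1])]

def adjust_brightness(img, factor):
    """Scale pixel intensities and clip to the 0..255 range."""
    return [[min(255, max(0, round(p * factor))) for p in row] for row in img]

def augment(img, seed=0):
    """Return several augmented copies of one image.

    The brightness range (0.8 to 1.2) is an assumed, illustrative choice.
    """
    rng = random.Random(seed)
    return [
        flip_horizontal(img),
        flip_vertical(img),
        rotate90(img),
        adjust_brightness(img, rng.uniform(0.8, 1.2)),
    ]
```

Each source image yields four additional samples here; in practice the set of transforms would be chosen so that the defect label remains valid after the transform.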

The images obtained from pipeline inspection contain extensive information regarding the type, location, and extent of pipeline defects. The efficient utilization of this information has been a primary focus of contemporary research. It is widely acknowledged that highlighting damage information and reducing background interference through image preprocessing are fundamental to extracting pipeline defect features [20]. For instance, dynamic histogram equalization enhances local contrast without affecting overall image contrast, thereby emphasizing pipeline defect details [35]. Mean, Gaussian, and other filtering operations are employed to reduce image noise [36]. Furthermore, morphological methods [37] are used to process pipeline crack images, enabling parameters such as crack length and width to be determined from the pixel resolution [38] and the severity to be estimated. Applying these operations in sequence and segmenting pipeline defects according to image type provides a feasible approach for processing a large number of original images. With today’s highly integrated machine vision tools, most image preprocessing functions are available in OpenCV, which greatly facilitates subsequent image feature extraction.
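As a rough illustration of these preprocessing steps, the sketch below implements three of them in plain Python: histogram equalization, a 3×3 mean filter, and conversion of a segmented crack mask to physical dimensions. Note the assumptions: the equalization shown is the plain global variant, used as a simpler stand-in for the dynamic histogram equalization of [35]; the crack is assumed to run roughly horizontally; and `mm_per_pixel` is a hypothetical calibration value. In practice the equivalent OpenCV routines would normally be used.

```python
def equalize_histogram(img, levels=256):
    """Global histogram equalization of a grayscale image (list of rows)."""
    hist = [0] * levels
    for row in img:
        for p in row:
            hist[p] += 1
    cdf, running = [], 0
    for h in hist:
        running += h
        cdf.append(running)
    total = cdf[-1]
    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:  # constant image: nothing to stretch
        return [row[:] for row in img]
    scale = (levels - 1) / (total - cdf_min)
    lut = [max(0, round((c - cdf_min) * scale)) for c in cdf]
    return [[lut[p] for p in row] for row in img]

def mean_filter3(img):
    """3x3 mean filter; border pixels are left unchanged."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            s = sum(img[y + dy][x + dx] for dy in (-1, 0, 1) for dx in (-1, 0, 1))
            out[y][x] = round(s / 9)
    return out

def crack_dimensions(mask, mm_per_pixel):
    """Estimate (length, mean width) in mm of a roughly horizontal crack.

    mask is a binary image (1 = crack pixel). Length is the horizontal
    extent of the crack; mean width is crack area divided by that length.
    """
    cols = [x for x in range(len(mask[0])) if any(row[x] for row in mask)]
    if not cols:
        return 0.0, 0.0
    length_px = cols[-1] - cols[0] + 1
    area_px = sum(1 for row in mask for v in row if v)
    return length_px * mm_per_pixel, (area_px / length_px) * mm_per_pixel
```

The area-over-length width is only a coarse estimate; skeleton-based measurements would be more appropriate for branching or strongly curved cracks.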

Identifying and extracting defects is crucial for analyzing pipeline damage and evaluating its operational status. Early image segmentation frequently employed thresholding [39], edge detection [40], geometric fitting [41], or composite methods that combined two or more of these algorithms. For instance, Hawar [42] applied the Gabor filter and an edge detection algorithm to segment sediment in the pipeline and then used least squares estimation in geometric fitting to assess pipeline deformation. Some scholars have also sought breakthroughs in clustering algorithms by empirically determining the parameter k for k-means and quick fuzzy C-means (QFCM) through multiple experiments [43, 44]. However, an appropriate algorithm must be selected for each specific task and then tuned toward the optimal solution, taking into account factors such as the complexity of the image, the size of the target, and the resolution. These requirements demand extensive prior knowledge and considerable time, which limits widespread adoption. Thanks to the emergence of artificial neural networks at the turn of the century, the multilayer perceptron (MLP) and the radial basis function (RBF) networks derived from it greatly accelerated machine learning and avoided the local minimum problem to some extent. At that time, one of the best available solutions for multiclass classification was to use RBF networks to distinguish internal pipeline damage from the background and to integrate the probabilistic feature space method into the visual system of the pipeline detection robot [45].
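To make the geometric fitting step concrete, the sketch below fits a circle to pipe-wall edge points by linear least squares and derives a simple deformation indicator. It is an illustration only, not the procedure of [42]: the algebraic (Kåsa) formulation is one common choice of least squares circle fit, and the `ovality` index is a hypothetical measure introduced here for demonstration.

```python
import math

def solve3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for i in range(3):
        pivot = max(range(i, 3), key=lambda r: abs(m[r][i]))
        if abs(m[pivot][i]) < 1e-12:
            raise ValueError("degenerate point set (collinear or too few points)")
        m[i], m[pivot] = m[pivot], m[i]
        for r in range(i + 1, 3):
            f = m[r][i] / m[i][i]
            for c in range(i, 4):
                m[r][c] -= f * m[i][c]
    x = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        s = m[i][3] - sum(m[i][c] * x[c] for c in range(i + 1, 3))
        x[i] = s / m[i][i]
    return x

def fit_circle(points):
    """Least squares fit of x^2 + y^2 + D*x + E*y + F = 0; returns (cx, cy, r)."""
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    syy = sum(y * y for _, y in points)
    sxy = sum(x * y for x, y in points)
    sxz = sum(x * (x * x + y * y) for x, y in points)
    syz = sum(y * (x * x + y * y) for x, y in points)
    sz = sum(x * x + y * y for x, y in points)
    # Normal equations obtained by setting the gradient of the squared
    # algebraic residual with respect to D, E, and F to zero.
    a = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, float(len(points))]]
    d, e, f = solve3(a, [-sxz, -syz, -sz])
    cx, cy = -d / 2, -e / 2
    return cx, cy, math.sqrt(cx * cx + cy * cy - f)

def ovality(points):
    """Hypothetical deformation index: radial spread relative to fitted radius."""
    cx, cy, r = fit_circle(points)
    dists = [math.hypot(x - cx, y - cy) for x, y in points]
    return (max(dists) - min(dists)) / r
```

An undeformed cross-section gives an index close to zero, while a flattened section gives a clearly positive value; any threshold separating the two would have to be calibrated against inspection standards.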

Nevertheless, the above methods depend heavily on manually designed feature extractors. Each extractor can detect only specific defect categories or pipeline types, and the underlying program is built for one particular defect, which makes it difficult to train a model that generalizes well. In addition, even when traditional image detection methods can be applied to the identification of various defects, their accuracy and robustness are low because of the high level of background noise and the large amount of information in the images. Given these problems and challenges of traditional machine vision, the emergence of deep learning and the development of convolutional neural networks (CNN) were of epoch-making significance.

3.2. Pipeline Defect Recognition Method Based on Deep Learning
3.2.1. Introduction of Deep Learning Image Recognition Technology

The concept of CNN can be traced back to LeNet-5 proposed by LeCun in 1998, which first defined the convolution in a seminal paper. However, due to limited algorithmic advancements, insufficient database resources, and constrained hardware resources at that time, the development of CNN was relatively slow. It was not until 2006 when Hinton [46] proposed the concept of deep learning that neural network models achieved a significant breakthrough in algorithmic advancements, and this kind of artificial neural network with deep structure gradually attracted the attention of scholars. In 2012, Hinton’s team achieved a significant triumph in the ImageNet image classification competition using an enhanced convolutional neural network [47], which led to the widespread recognition of deep learning in the academic community.

Compared to previous image recognition methods which require manual feature setting followed by screening and extraction processes, deep learning, as a subset of machine learning, can process large amounts of raw data using a deep neural network. It effectively mitigates over-fitting and also has great advantages in accuracy and efficiency. Convolutional neural networks are commonly utilized in the field of image detection as a type of feedforward neural network. A standard convolutional neural network consists of convolutional layers, pooling layers, activation function layers, and fully connected layers, as depicted in Figures 5(a) and 5(b), illustrating the network structure and feature extraction process [48].

In contrast to traditional computer vision and image processing techniques, CNN requires less image preprocessing and does not need complex hand-designed feature extractors, leading to significant improvements in classification accuracy and generalization capability. It predicts categories by applying filters with initially random weights and biases, passing the resulting features to the next layer, and calculating the error between the true value and the predicted score. It then applies backpropagation to adjust the weights and biases of the filters continuously so as to optimize the predicted category. In operation, the input to the CNN is first normalized and preprocessed with the aforementioned methods, and the network is then trained on the input images with gradient descent, which improves learning efficiency and the accuracy of the results. The convolutional layers rely mainly on multiple convolution kernels to capture image contours and generate corresponding feature maps. Each kernel consists of weight coefficients and an associated bias, and the size of the “receptive field” is determined by the kernel size. As the convolution kernel slides over the input, it computes a dot product within the receptive field of the previous layer’s feature map, sums the result, and adds the bias, as shown in Figure 5(c). Images processed by CNN are usually large and contain a great deal of data. To reduce the training load of the neural network, pooling layers are generally added between the convolutional layers. Pooling exploits the principle of local correlation to reduce the data size while retaining useful information, thus reducing the computational load and effectively suppressing overfitting. The commonly used methods are max pooling and average pooling; Figure 5(d) illustrates the operation flow of max pooling. The activation layer is located after the convolutional layer and builds specific functional relationships between adjacent layers. Functions such as ReLU, Sigmoid, and Tanh provide a nonlinear response, allowing the neural network to process and classify input information in a more complex manner. After the convolutional and pooling layers, there are usually one or more fully connected layers, which integrate the various features from the previous layer’s neurons through complete connections. At this stage, the feature maps lose their three-dimensional characteristics, and the output, under the action of the activation function, becomes a one-dimensional vector.
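The forward pass through one convolution, activation, and pooling stage can be written out directly. The following plain-Python sketch uses a single channel, stride 1, and no padding, with a hand-picked edge kernel rather than learned weights; all of these are simplifying assumptions made for illustration. As in most deep learning frameworks, the "convolution" is implemented as a sliding dot product (cross-correlation).

```python
def conv2d(img, kernel, bias=0.0):
    """Slide the kernel over the image; each output is a dot product plus bias."""
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = len(img) - kh + 1, len(img[0]) - kw + 1
    return [
        [
            sum(img[y + i][x + j] * kernel[i][j] for i in range(kh) for j in range(kw)) + bias
            for x in range(ow)
        ]
        for y in range(oh)
    ]

def relu(fmap):
    """Element-wise ReLU activation."""
    return [[max(0.0, v) for v in row] for row in fmap]

def max_pool(fmap, size=2):
    """Non-overlapping max pooling; incomplete border windows are dropped."""
    h, w = len(fmap) // size, len(fmap[0]) // size
    return [
        [
            max(fmap[y * size + i][x * size + j] for i in range(size) for j in range(size))
            for x in range(w)
        ]
        for y in range(h)
    ]

def flatten(fmap):
    """Turn a 2D feature map into the 1D vector fed to a fully connected layer."""
    return [v for row in fmap for v in row]

# Toy input: a 5x5 image with a vertical intensity edge between columns 2 and 3.
image = [[0, 0, 0, 1, 1] for _ in range(5)]
# Hand-picked vertical-edge kernel (illustrative, not learned).
edge_kernel = [[-1, 0, 1] for _ in range(3)]

features = flatten(max_pool(relu(conv2d(image, edge_kernel))))
```

Here the 3×3 kernel fixes the receptive field of each output element at 3×3 input pixels, and pooling reduces the 3×3 feature map to a single value before flattening.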

Following enhancements in network structure and training modes [47], the availability of annotated datasets [49], and advancements in computer hardware [50], convolutional neural networks have surpassed traditional image recognition patterns, achieving significant progress in object detection, facial recognition, motion capture, medical diagnosis, and various other fields. In the era of artificial intelligence, civil engineering has undergone a substantial transformation in planning, design, construction, maintenance, and disaster prevention [51, 52]. For instance, in intelligent operation and maintenance, computer vision-based crack identification has become a popular method for structural damage diagnosis. With the increasing maturity of noncontact remote sensing crack detection technology, the feasibility of image recognition of building surface defects has been validated, and techniques for quantifying cracks are continuously being refined [53]. Recent research [54] has started to organically integrate convolutional neural networks with two major regression prediction models (random forest, RF, and extreme gradient boosting, XGBoost) to establish a comprehensive framework for the automatic detection of concrete structure cracks and prediction of crack depth. Customized detection robots, such as underwater robots, road crack detection vehicles, and drones, have been developed for specific civil infrastructure such as roads [55], bridges [56], tunnels [57], and buildings [58], based on environmental variations, significantly enhancing detection efficiency and accuracy. The fusion of deep learning with the civil engineering industry has propelled the advancement of automated and intelligent building inspection. Drainage pipeline detection technology, as a special structure in civil engineering, has also made remarkable progress through the optimization of relevant algorithms and the development of robots for specific operating environments.

In recent years, deep learning models based on convolutional neural networks (CNNs) have increasingly been utilized for detecting and identifying defects in drainage pipelines. These models extract features from input images acquired through various inspection methods, such as CCTV or QV. To enhance detection performance in specific scenarios, several network structures built on the original CNN structure have been proposed, such as YOLO, Faster R-CNN, SSD, Mask R-CNN, and SOLO. These algorithms have demonstrated promising results in construction engineering management domains, including personnel monitoring and status detection [69]. In the context of pipeline detection, researchers have categorized CNN algorithms according to task goals: image classification, object detection, and pixel-level segmentation. The segmentation task has further evolved into two types: semantic segmentation and instance segmentation. Table 1 provides a summary of the strengths and weaknesses of some typical CNN structures and their applications in the field of drainage pipeline detection.

3.2.2. Pipeline Defect Classification Based on Deep Learning

Regardless of the type of task, CNNs have obvious advantages over traditional machine learning methods in data processing. Classification is the most developed and mature task type and underpins all subsequent tasks, and its breakthroughs are largely due to the continuous expansion and improvement of datasets. Kumar [3] used AlexNet to classify three types of defects (root invasion, sediment, and cracks) in 12,000 drainage pipeline images, reaching an average test accuracy of 86.2% (Figure 6(a)). As network structures have deepened and parameters have been tuned, the accuracy of pipeline defect detection has gradually improved. After Hassan [6] adjusted AlexNet, the number of defect types that could be classified increased to six. The built-in text detection and recognition module can analyze the reports obtained from CCTV inspection and accurately display the defect location. In addition to collecting more pipeline damage images, obtaining more reliable initial weights through transfer learning, and improving the optimization algorithms, one study [59] used StyleGAN v2 to generate 1000 images for each defect category, increasing the number of images by 8.96 times. The fusion of generative adversarial networks (GANs) and CNNs effectively expands the dataset and offers a feasible route to improving the classification accuracy of pipeline defect types. However, enlarging the dataset alone is not enough. In CCTV inspection, normal images far outnumber defect images. To address this class imbalance, Li [72] proposed a hierarchical classification of defects on top of the original deep network structure, which improved the average accuracy on the training and validation sets by 4.8%. 
In contrast to the complex network structures above, the three-layer convolutional neural network designed by Meijer [34] also recognizes common pipeline defects. It is expected to reduce the number of images requiring manual review by 60.5%, which shows that lightweight convolutional neural networks also have practical engineering applications, particularly for time-sensitive pipeline inspection tasks or routine monitoring of pipeline conditions.
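
The idea of hierarchical classification under class imbalance can be sketched as a two-level decision: a first classifier separates normal from defective frames, and a second one, consulted only for defective frames, assigns the defect type. The sketch below is a generic illustration under that assumption and is not the architecture of [72]; the function name, the threshold, and all probabilities are hypothetical.

```python
def hierarchical_predict(p_defect, defect_type_probs, threshold=0.5):
    """Two-level decision: normal vs. defective first, defect type second.

    p_defect          -- probability from a binary first-level classifier
    defect_type_probs -- dict of defect name -> probability from a
                         second-level classifier trained on defective images
    threshold         -- hypothetical cut-off for the first level
    """
    if p_defect < threshold:
        return "normal"
    # Only frames flagged as defective reach the defect-type classifier.
    return max(defect_type_probs, key=defect_type_probs.get)

# Illustrative outputs for two frames (all numbers are made up).
frame_a = hierarchical_predict(0.08, {"crack": 0.5, "root": 0.3, "deposit": 0.2})
frame_b = hierarchical_predict(0.93, {"crack": 0.2, "root": 0.7, "deposit": 0.1})
```

Because the abundant normal frames are filtered at the first level, the second-level classifier can be trained on a far more balanced set of defect images.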

Regardless of the algorithm used, capturing images from CCTV video frame by frame and establishing a large, high-quality, and reliable database remain a crucial prerequisite for continuously optimizing convolutional network models [2]. Inadequate coverage of defect types within the dataset hinders the establishment of a fully reliable intelligent defect classification system. Moreover, low original image quality impedes satisfactory results, whether the dataset is expanded through data augmentation or through generative adversarial networks. Currently, drainage pipeline defect datasets are scarce worldwide. While Sewer-ML (https://vap.aau.dk/sewer-ml/), the first publicly available sewer defect dataset, has significantly contributed to enhancing industry intelligence with its 1.3 million images, it lacks defect type and location annotations. Therefore, it is imperative for local governments and academic teams to strengthen collaboration and to construct, refine, and publish representative high-quality datasets, thus laying the groundwork for the more challenging tasks of target detection and segmentation and for further improvement of defect classification.

3.2.3. Pipe Defect Localization Based on Deep Learning

In addition to classifying pipeline defects, enabling CNNs to perform accurate localization tasks has garnered significant interest due to the high adaptability of target detection neural networks in this domain. Based on the structural features of different networks, target detection can be categorized into two primary types: two-stage and single-stage detection models.

Among two-stage detectors, Faster R-CNN evolved from R-CNN and Fast R-CNN, with marked improvements in detection speed and mean average precision (mAP), and reaches 5 frames per second (FPS) on GPUs. This makes it a quintessential network for automatic drainage pipeline defect detection systems [73]. By incorporating the k-means clustering algorithm, Faster R-CNN can align prediction boxes more swiftly and precisely with the actual regions of interest, raising mAP to 92.4% [74]. However, the dataset size and training duration are not reported [74]. Wang [62] suggests that Faster R-CNN is notably proficient in identifying cracks, tree root invasions, and lateral connections. The same study recommends tracking recurrent defects across frames in CCTV footage and recognizing multiple defects within the same frame, an approach that could enhance defect quantification and pipeline condition assessment. Research broadly agrees that model accuracy can improve with larger datasets, deeper networks, and smaller filter sizes and convolution strides. Nonetheless, such optimizations can extend the training period and slow defect detection. Resolving these fundamental limitations of two-stage models may require structural changes to the network.
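
One common way k-means is used with detectors is to cluster the widths and heights of labeled boxes into a few anchor shapes, so that initial boxes already resemble typical defects. The sketch below assumes that usage and a 1 − IoU style similarity (boxes are assigned to the anchor they overlap most); the cited study [74] does not specify these details here, and the box sizes, the deterministic initialization, and the function names are hypothetical.

```python
def iou_wh(box, anchor):
    """IoU of two (width, height) boxes aligned at a shared corner."""
    inter = min(box[0], anchor[0]) * min(box[1], anchor[1])
    union = box[0] * box[1] + anchor[0] * anchor[1] - inter
    return inter / union

def kmeans_anchors(boxes, k, iterations=50):
    """Cluster (width, height) pairs into k anchor shapes."""
    # Deterministic initialization: pick evenly spaced boxes from the list.
    anchors = [boxes[i * len(boxes) // k] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for box in boxes:
            # Assign each box to the anchor with the highest IoU.
            best = max(range(k), key=lambda j: iou_wh(box, anchors[j]))
            clusters[best].append(box)
        new_anchors = []
        for j, members in enumerate(clusters):
            if members:
                mean_w = sum(b[0] for b in members) / len(members)
                mean_h = sum(b[1] for b in members) / len(members)
                new_anchors.append((mean_w, mean_h))
            else:
                new_anchors.append(anchors[j])  # keep an empty cluster's anchor
        if new_anchors == anchors:
            break  # assignments have stabilized
        anchors = new_anchors
    return anchors

# Made-up labeled boxes in pixels: thin cracks and large root intrusions.
boxes = [(10, 100), (12, 110), (8, 90), (200, 180), (220, 200), (210, 190)]
anchors = kmeans_anchors(boxes, k=2)
```

With these six boxes the procedure settles on one tall, thin anchor and one large, nearly square anchor, which mirrors the two defect shapes in the toy data.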

The emergence of SSD and YOLO models has opened possibilities for real-time detection within drainage pipelines. Studies [75] indicate that these single-stage object detection frameworks considerably outpace their two-stage counterparts in speed (three times and twice the rate of Faster R-CNN, respectively), albeit at the expense of precision. Notably, these single-stage algorithms struggle to achieve high recall for small pipeline defects. Beyond the widely implemented attention mechanisms designed to adjust model weights, Shen [76] integrated the RFB module into the backbone network of SSD, coupled with a skip-dense connection module (SDCM) for feature fusion. The result is a significant enhancement in detection accuracy, with a recorded mAP of 92.20%. In Tan’s research [70], improvements to the YOLO algorithm achieved comparable accuracy; the training process and outcomes are shown in Figure 6(b).

To further advance the automation of pipeline inspection, it is essential to extract contextual semantic information directly from data-rich media such as videos and ultimately produce text reports that include the types and locations of defects. Yin [7, 77] designed a defect detector based on YOLO v3 which, after training on a large amount of data, led to the development of the video interpretation algorithm for sewer pipes (VIASP). This detection system is suited to routine sewer pipeline assessments. Nevertheless, because the accuracy of the defect detector still leaves room for improvement, the system requires the involvement of technical personnel. A separate study [78] incorporated an additional spatial pyramid pooling (SPP) module into the YOLO v4 algorithm to capture image features at multiple scales, thus enlarging the model’s receptive field and improving computational efficiency. The model achieved an mAP comparable to that of the previous study [76], while the recall rate increased to 89.0%. The YOLO family continues to evolve, and recent research focuses on open-sourcing and enhancing more sophisticated network models, such as later versions of YOLO (e.g., YOLO v7 [79] and YOLO v8 [80]) and the detection transformer (DETR) [81], and on applying them to real-time detection of damage in various concrete structures. With the ongoing development of deep learning network models and growing engineering experience, it is expected that the identification and positioning of drainage pipeline defects will become increasingly intelligent.

3.2.4. Pipeline Damage Characterization Based on Deep Learning

If the task of target detection is to ascertain the count, positioning, and spatial interrelations of internal pipeline defects with precision and efficiency, then the goal of pixel-level segmentation network models is to harness the full potential of every pixel within the dataset’s images. They aim to delineate the extent of pipeline deterioration with precision, guided by the contours and dimensions of the masks produced by the algorithm.

Open-source semantic segmentation models are an early and widely used deep learning approach to precisely characterizing the extent of pipeline damage. Various well-established semantic segmentation models, such as FCN, SegNet, U-Net, and DeepLabv3, perform pixel-level segmentation using CNN models with encoder-decoder structures [82]. However, accurately delineating pipeline defects through image recognition remains challenging because, as the number of network layers increases, the features of small defects are often lost during convolution and pooling. In pixel-level segmentation tasks in particular, failure to extract these minuscule defects can lead to erroneous evaluations of damage severity. To evaluate the segmentation performance of different feature extractors and network models more comprehensively, the evaluation indexes have become increasingly elaborate. For example, the mean pixel accuracy (mPA) refines the simplest indicator, pixel accuracy (PA), by averaging the accuracy over classes, and the frequency-weighted intersection over union (FWIoU) weights the intersection over union of each class by the frequency of that class. Zhou [67] compared DeepLabv3+ networks with different embedded feature extractors and tested other network structures for detecting defects in complex drainage pipeline environments. The experiment concluded that D-ResNet50 performed best on all indicators except mPA. Compared with other network structures such as U-Net, DeepLabv3+ has obvious advantages as the base network structure. 
In a more comprehensive model performance evaluation system, the confusion matrix [67] and ablation study [83] shown in Figure 7 help researchers understand the classification accuracy and errors of the model for different categories and evaluate the importance of each component in the system. Optimizing and comparing segmentation performance on the basis of mature network structures therefore remains a research focus. Pan [71] pointed out that after adding feature reuse and attention mechanism blocks between the original skip connections of U-Net, the new model can segment images with high accuracy at a considerable speed, but in multitarget segmentation tasks, the edges between some defects are not clearly segmented (as shown in Figure 6(c)), resulting in a recall rate of only 47.37%. After abstracting the defect features as masks, the difficulty of characterizing the severity of sewer pipeline defects is greatly reduced. However, as segmentation models mature, demand has emerged for pixel-level segmentation of multiple instances of the same object type. Improving the accuracy and detection speed of the model while detecting multiple targets has become an important goal for improving the operation and maintenance of drainage pipelines and the decision-making ability of urban water environment management.
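
The segmentation indicators named above (PA, mPA, mean IoU, and FWIoU) can all be derived from a pixel-level confusion matrix. The following is a minimal sketch using their standard definitions; the two-class matrix is invented to show how a rare defect class affects each indicator and does not reproduce any result from [67].

```python
def segmentation_metrics(cm):
    """Compute PA, mPA, mIoU and FWIoU from a confusion matrix.

    cm[i][j] is the number of pixels of true class i predicted as class j.
    Classes that never occur in the ground truth are skipped.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    true_count = [sum(cm[i]) for i in range(n)]                      # row sums
    pred_count = [sum(cm[i][j] for i in range(n)) for j in range(n)]  # column sums
    present = [i for i in range(n) if true_count[i] > 0]

    pa = sum(cm[i][i] for i in range(n)) / total
    mpa = sum(cm[i][i] / true_count[i] for i in present) / len(present)
    iou = {i: cm[i][i] / (true_count[i] + pred_count[i] - cm[i][i]) for i in present}
    miou = sum(iou.values()) / len(present)
    fwiou = sum(true_count[i] / total * iou[i] for i in present)
    return {"PA": pa, "mPA": mpa, "mIoU": miou, "FWIoU": fwiou}

# Made-up example: class 0 = background (100 px), class 1 = crack (10 px).
cm = [[95, 5],
      [5, 5]]
metrics = segmentation_metrics(cm)
```

In this toy case PA is about 0.91 even though only half of the crack pixels are found, whereas mPA (0.725) and mIoU (about 0.62) expose the weak defect class; FWIoU (about 0.85) sits in between because it weights classes by pixel frequency.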

The task of instance segmentation can be seen as a combination of object detection and semantic segmentation. Mask R-CNN, which extends Faster R-CNN, has been applied to accurately extract defects from drainage pipeline images or video data. As a two-stage network, Mask R-CNN feeds the generated feature map into the region proposal network (RPN) to generate regions of interest (ROIs) and performs instance segmentation of different defect types through its mask branch. After adjustment and optimization, the mPA can reach 86.89%, which is satisfactory compared with other models trained in the same period [84]. In another study based on this network model, Xu [23] used a self-developed pipeline robot to collect data and, for the first time in a drainage pipeline evaluation, realized three subtasks: instance segmentation, real-time positioning of the sewer inspection equipment, and real three-dimensional (3D) model reconstruction. Their verification experiments show that the mPA can reach 92.7% when the recall rate reaches 50%, but the maximum measurement error of the 3D model can reach 1 m, and the mPA for small target objects is only 42.2%. It is worth mentioning that the paper releases the data, source code, and trained weights used in the research (https://github.com/fangxu622/Sewer-Detection), setting a good example for communication and cooperation in this field.

In addition to two-stage models, SOLO, inspired by YOLO, is one of the typical single-stage models for instance segmentation and has gradually been applied to pipeline detection in recent years. Li [68] optimized SOLOv2 and named the new defect detection model Pipe-SOLO. The experimental data show that the mPA of the model reaches 59.3%, the best performance on the dataset used in Li’s experiment (see Figure 6(d)), which demonstrates the feasibility of applying single-stage models with simpler structures to pipeline detection. In another study, Ma extended the ideas of the earlier defect classification study [59], combined an attention-based dehazing algorithm with an adversarial-network deblurring algorithm, and proposed the real-time segmentation network Pipe-Yolact-Edge, whose highest mPA reaches 92.65% [85]. This team’s recent research [86] has begun to use 3D point clouds for three-dimensional modeling of the pipeline, further promoting the establishment of an inspection and evaluation system for pipeline defects.

However, because training processes, parameter weights, and evaluation indicators differ between research groups, and because the underlying logic of the algorithms themselves differs, it is unreasonable to rank segmentation models and their feature extraction algorithms solely by comparing reported indicators. Judgments and choices must still be made with reference to the dataset used in the training stage.

Although the performance of deep learning technology in drainage pipeline detection is remarkable, a critical examination of current research reveals three major shortcomings in the automation and intelligence of pipeline defect classification, detection, and characterization based on convolutional neural networks.
(1) The accuracy of the positioning system is highly correlated with the buried depth and geographical location of the pipeline. With inspection data gathered in harsh working environments, it is difficult to determine whether defects detected in succession come from the same location.
(2) Labeled datasets of pipeline damage are scarce, and the problems of model overfitting and low robustness cannot be fundamentally solved through neural network innovation, algorithm optimization, and data augmentation alone.
(3) It is difficult to make significant advances in the positioning and accurate recognition of defects by relying on visual sensors alone to capture the internal defect information of pipelines. Improving intelligent defect recognition and positioning through multisensor data fusion is essential to establishing the data analysis foundation for pipeline defect detection and to overcoming the challenges of quantifying defect levels from image data alone.

4. Intelligent Diagnosis of Underground Concrete Drainage Pipeline

4.1. Damage Measurement and Evaluation of Drainage Pipeline under Mathematical Intelligence

As theoretical models are enriched, detection equipment advances, and computing power improves, intelligent diagnosis based on pipeline damage images has become a hot research topic. The ideal scenario integrates information collection, image recognition, defect classification and localization, structural damage assessment, and operation and maintenance recommendations. The diagnostic process is outlined in Figure 8.

The quantification of pipeline damage through deep learning involves measuring the physical and geometric dimensions of defects, which is the initial stage in assessing pipeline damage. This process relies heavily on pixel-level segmentation algorithms, notably instance segmentation network models. Wang [87] developed a model that is robust against complex environmental factors such as low light and intense brightness. This model uses automatically generated mask sizes to recognize various damaged areas, with a maximum pixel-level measurement error within 11%. Furthermore, quantifying the severity of sewer pipeline defects by the ratio of the masked pixel count to the total pixel count of the image has been shown to be viable on the basis of pixel segmentation [67, 81, 88]. In particular, the evaluation criteria for damage caused by different defects (such as cracks, corrosion, and tree roots) need to be distinguished. Otherwise, the damage rating of a crack mask will be far lower than that of a large-area mask such as deposition or corrosion, and the risk posed by small, deep cracks will be underestimated or even ignored in pipeline diagnosis. Pipeline damage takes many forms, and it is unreasonable either to ignore damage types or to attend to only one of them. The dimensional quantification of surface defects in concrete structures started earlier. Methods for measuring the physical and geometric parameters of macroscopic damage such as cracks, spalling, and exposed reinforcement are relatively mature, and their errors meet engineering needs [89]. In a recent study, Liu [90] used the fractal information of cracks to analyze the apparent damage of concrete members, and the overall accuracy of the predicted potential damage level of the structure reached 91.67%. 
Nonetheless, in pipeline diagnostics, the correlation between the predicted mask size and the actual physical and geometric dimensions remains ambiguous. This is primarily due to uncertainty in the detection distance during pipeline inspection, as well as the need to account for spatial relationships when evaluating the size of damage on the curved wall of a circular pipe depicted in two-dimensional images [86]. Efforts to establish an efficient and accurate three-dimensional model of the pipeline [23, 91] remain imperative to provide a more dependable basis for the assessment of pipeline defects.
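
The mask-ratio approach to severity, and the need for defect-specific criteria, can be illustrated with a short sketch. The grading thresholds below are hypothetical placeholders chosen only to show the mechanism; they are not taken from any inspection standard or from the cited studies.

```python
def mask_area_ratio(mask):
    """Fraction of image pixels covered by a binary defect mask."""
    total = sum(len(row) for row in mask)
    defect = sum(1 for row in mask for value in row if value)
    return defect / total

def severity_grade(ratio, thresholds):
    """Map an area ratio to a grade from 1 to len(thresholds) + 1."""
    grade = 1
    for limit in thresholds:
        if ratio >= limit:
            grade += 1
    return grade

# Hypothetical, defect-specific thresholds: a thin crack covers few pixels,
# so it must reach a high grade at a much smaller area ratio than deposits.
THRESHOLDS = {
    "crack": (0.01, 0.03, 0.06),
    "deposit": (0.10, 0.25, 0.40),
}

# Toy 4 x 5 mask with 3 defect pixels (ratio 0.15).
mask = [
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
]
ratio = mask_area_ratio(mask)
crack_grade = severity_grade(ratio, THRESHOLDS["crack"])
deposit_grade = severity_grade(ratio, THRESHOLDS["deposit"])
```

The same area ratio yields grade 4 when interpreted as a crack but only grade 2 as a deposit, which is the distinction the text argues a single shared criterion would miss. The ratio is still expressed in pixels, so it inherits the unresolved link between mask size and physical dimensions.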

It is also an effective method to detect pipelines by mimicking other human sensory systems, such as the use of tactile means, rather than relying solely on machine vision. In addressing the issue of microbial corrosion in pipelines, a sensor based on drill resistance [92] was developed to precisely measure the depth of microbially corroded concrete layers. Experiments demonstrated that the measurement accuracy can achieve millimeter-level precision, providing valuable additional information for evaluating the condition of concrete in drainage pipelines. Additionally, Ross [93] and Richard Hall [94] designed a remote-operated robot and an extensible probe to measure the depth and width of cracks on the surface of the pipe and the protrusions and depressions on the pipe wall, utilizing the displacement probe. Various types of defects in the concrete pipe were successfully identified and quantified, and the impact of tip shape on concrete assessment was studied. These experiments illustrated that the method exhibits high accuracy and repeatability, representing an effective exploration of new detection methods.

The efficient and accurate evaluation of the degree of pipeline damage based on the test results is the cornerstone of establishing an intelligent diagnosis system for drainage pipelines. Previous research [6] has demonstrated the effectiveness of integrating text information from CCTV detection into the pipeline intelligent evaluation system to enrich the pipeline damage data, such as geographical location and time information. Recent studies [95] have started analyzing text reports generated by CCTV detection and grading pipeline defects using natural language processing (NLP) models, as shown in Figure 9. The defect assessment model can classify the defect level from 1 to 5 with an accuracy rate of over 92%. However, the decision-making process solely based on text information such as word meaning and frequency greatly relies on the definition of defects in the specification. This approach does not propose an evaluation theory suitable for intelligent technology, and the final conclusion does not consider the physical damage of the actual pipe section and has the limitation that it can only make qualitative judgments, suggesting the need for further discussion and improvement. As early as 2000, PIPAT, the most advanced quantitative evaluation system for underground pipelines at the time, was able to automatically detect, classify, evaluate defects, and generate report samples [96]. With the advancement of robotics, it is expected that the threshold and cost of pipeline evaluation systems will continue to decrease, and integrated advanced technology will become more diversified. The detection data for pipeline sections will become more comprehensive, leading to increasingly reliable intelligent diagnosis results for the entire pipeline. These are essential for achieving intelligent prediction and operation and maintenance of urban drainage pipeline network damage.

4.2. Damage Prediction and Operation and Maintenance of Drainage Pipe Network Driven by Data Intelligence

After acquiring a substantial amount of pertinent information on the pipe section’s defect status, the next step involves predicting the subsequent service status of the regional drainage pipeline and even the urban pipe network and providing a rational and precise assessment. Existing theories for predicting and evaluating drainage pipeline damage can be categorized into three types: physical model, statistical model, and artificial intelligence model. The first two methods are grounded in complex physical and mathematical theories, essentially originating from simplified deterministic and mathematical statistics models, and possessing a profound theoretical basis. However, these traditional methods heavily rely on the subjective judgment of evaluators, resulting in inconsistencies in the evaluation indices of pipeline deterioration, and the resulting models take various forms, making it challenging to completely replicate the experimental data. Furthermore, as the input and output variables increase, the time and capital costs of deepening the theoretical model to meet testing requirements will escalate [97]. Fueled by the power of big data and artificial intelligence, the third model has garnered increasing attention from researchers. In this approach, data mining tasks are entrusted to sophisticated intelligent models, which leverage various algorithms to learn from existing data. The goal is to utilize the internal logical relationships between numerous independent variables (i.e., various influencing factors) and dependent variables (i.e., the degree of deterioration and damage of pipelines) to predict potential damage levels. Among the artificial intelligence techniques propelled by advanced intelligence, artificial neural networks, fuzzy logic, and a variety of machine learning algorithms [98] have proven to be more mature and widely used in the operation and maintenance of drainage pipelines. 
Figure 10 illustrates their operational mechanisms and the pipeline damage prediction and evaluation process.

Artificial neural network (ANN) is characterized by its multilayer network structure. It is an engineering technology capable of resolving nonlinear relationships within a specified range of data. In recent years, it has found application among numerous scholars for monitoring and predicting the operational and maintenance aspects of engineering structures due to its effective data mining and modeling capabilities for complex one-dimensional systems [102]. With over 30 years of development, ANN has sparked a research surge in robust structural damage models with high-level generalization capabilities. The neural network, with the ability to adapt to complex relationships without the need for clear physical specifications, emphasizes the significance of multiple neurons corresponding to external input information within the initial layer. Its application in complex pipeline service and operational issues is influenced by various factors including pipeline age, location, scale, climatic conditions, and buried depth, all of which hold varying degrees of importance. Some studies indicate that factors such as pipeline size, tree count, and slope outweigh the importance of soil and buried depth [103]. Different network types boast unique advantages and disadvantages in prediction capabilities. The back-propagation (BP) neural network optimized by genetic algorithms and the probabilistic neural network (PNN) are used for predicting pipeline defect levels, respectively [104]. While the former excels, it demands additional time and energy input for determining hidden neuron count and initial weight values and addressing local optimization issues. Conversely, the latter boasts a relatively simple prediction model, but its performance hinges closely on the training method, leading to a higher likelihood of incorrect conclusions. In addition, different optimization algorithms will also affect the prediction ability of the neural network. 
Ebtehaj [99, 105] compared the performance of neural networks optimized by different algorithms in predicting the service performance of drainage pipelines and found that the imperialist competitive algorithm (ICA) was slightly more accurate than the genetic algorithm. For the radial basis function (RBF) network, the back-propagation (BP) algorithm, which had previously performed strongly, marginally underperformed the particle swarm optimization (PSO) algorithm. Nonetheless, these rankings reflect training on specific datasets rather than a strict hierarchy among the algorithms. Consequently, establishing more objective evaluation indicators and comparing the performance of various neural networks in pipeline prediction remain a crucial contemporary topic.

Fuzzy logic exhibits some characteristics that are opposite to those of neural networks. Rather than mapping the internal relationships of data through neurons and weights, this approach translates uncertainty and fuzziness into mathematical concepts based on fuzzy theory, applies fuzzy rules, and then maps the fuzzy results back to crisp outputs through defuzzification. Accordingly, it does not rely heavily on training with large-scale datasets, offers strong model interpretability, and is easily integrated with other advanced technologies. It has become increasingly common in pipeline risk assessment in recent years. Thanks to the flexibility and adaptability of fuzzy theory, this method can not only use binary logic to evaluate the contribution of soil corrosiveness to the corrosion rate of pipelines [100] but also be combined with GIS technology to consider multiple inputs. Fuzzy modeling of underground pipelines in mining areas can accurately identify pipes with higher risk ratings and mark them in red [106]. As a core algorithm embedded in other multicriteria decision-making methods, it can achieve a compatibility of 0.96 [107], which holds theoretical significance and engineering value for the improvement of urban waterlogging prevention and control systems. However, this method relies heavily on the experience and knowledge of domain experts to construct fuzzy sets and fuzzy rules, making it difficult to ensure the accuracy and comprehensiveness of the process. Some scholars have pointed out that relying on this method alone may introduce numerous likelihood assumptions [108], which could hinder further improvement of its evaluation performance.
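
The fuzzification, rule evaluation, and defuzzification steps can be sketched for a toy pipe risk score. This is a simplified illustration, assuming two inputs (pipe age and a soil corrosivity index), linear membership functions, and weighted-average defuzzification over fixed rule outputs (a zero-order Sugeno-style scheme); all breakpoints and rule outputs are hypothetical stand-ins for expert knowledge and do not come from the cited studies.

```python
def ramp_up(x, low, high):
    """Membership that rises linearly from 0 at `low` to 1 at `high`."""
    if x <= low:
        return 0.0
    if x >= high:
        return 1.0
    return (x - low) / (high - low)

def pipe_risk(age_years, soil_corrosivity):
    """Return a crisp risk score in [0.1, 0.9] from two fuzzy inputs."""
    # Fuzzification: degrees of membership in expert-defined fuzzy sets.
    old = ramp_up(age_years, 10, 50)
    new = 1.0 - old
    corrosive = ramp_up(soil_corrosivity, 2, 8)
    benign = 1.0 - corrosive

    # Rule base: (firing strength using min as AND, crisp rule output).
    rules = [
        (min(old, corrosive), 0.9),  # old pipe in corrosive soil -> high risk
        (min(old, benign), 0.5),     # old pipe in benign soil -> medium risk
        (min(new, corrosive), 0.5),  # new pipe in corrosive soil -> medium risk
        (min(new, benign), 0.1),     # new pipe in benign soil -> low risk
    ]

    # Defuzzification: weighted average of the rule outputs.
    total_strength = sum(strength for strength, _ in rules)
    return sum(strength * output for strength, output in rules) / total_strength

high_risk = pipe_risk(60, 9)
low_risk = pipe_risk(5, 1)
mid_risk = pipe_risk(30, 5)
```

Every number in the rule base is set by hand, which illustrates both the interpretability of the method and its dependence on expert judgment.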

Machine learning, as a broad set of concepts, encompasses not only the two data prediction methods mentioned above but also incorporates a diverse array of training strategies, including supervised learning, unsupervised learning, semisupervised learning, and reinforcement learning [109]. Among them, there are excellent algorithms without a multilayer network structure, or without the introduction of fuzzy logic to assist in decision-making, or which organically combine these two methods (such as integrating them into a cloud-based data-driven system [110]). The prediction accuracy is outstanding and holds promising prospects in the intelligent evaluation of drainage pipelines. Within the realm of mature machine learning algorithms, well-known algorithms such as decision tree [111], support vector machine [112], and random forest [113] stand out. As the evaluation indices of the algorithm continue to expand, scholars have begun to intuitively assess the correlation of the model with certain types of samples through the confusion matrix. Through repeated tests, the input parameters can become more refined. According to reports [114], compared with open joints and holes, sewer leakage is more easily controlled by cracks and is highly correlated with the defect area of sewage pipelines. Some studies have started to introduce new concepts such as hydraulic fingerprinting [2, 115] to further enhance the evaluation system of urban drainage networks. The evaluation of pipeline service performance has also shifted from a simple rating based on specifications to the accurate prediction of more specific performance criteria such as pipeline sediment transport [116], silt deposition [117], and leakage detection and remediation costs [118]. These criteria directly or indirectly impact the resilience of the pipeline under normal service and extreme disasters. The improved prediction accuracy strengthens the foundation of pipeline diagnosis. 
Moreover, some algorithms and frameworks with superior performance have been developed and applied to pipeline detection, urban flood control, and disaster reduction. For example, XGBoost [101] and LightGBM [119] can rapidly generate high-precision, robust predictions while also quantifying the importance of each input indicator. In a similar vein, Di [117] integrated a GAN with a CNN containing two modules, object detection and semantic segmentation, following the method shown in Figure 11, to achieve dynamic measurement of water flow in the pipeline. Two data mining models (SSAE-TSNE for extracting fault features and MS-LSTM for handling regression problems on long time series) were integrated to construct a diagnostic system for pipeline sediment deposition. Evidently, nesting various algorithms and enhancing or developing different intelligent diagnostic models can significantly improve on traditional blind, mechanical maintenance methods and raise the accuracy of prediction models, ultimately curbing pipeline degradation and reducing management costs. This also provides theoretical support for the efficient operation and maintenance of urban underground pipelines. Notably, the aforementioned techniques may overfit when large sample datasets are unavailable, which significantly reduces the generalization ability of the model. Establishing the optimal network model through parameter adjustment relies heavily on the designer's prior knowledge and repeated experiments. This complex and tedious process is difficult to standardize, making it a key challenge for precise, data-driven prediction.
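The overfitting risk noted above can be made visible by comparing accuracy on the training data with accuracy under k-fold cross-validation. The sketch below uses a 1-nearest-neighbor classifier as a stand-in for a model that memorizes its training set; it is not one of the cited algorithms, and the pipe-age samples and labels (1 = deteriorated) are hypothetical.

```python
def predict_1nn(train, query):
    """Return the label of the training sample closest to the query."""
    nearest = min(
        train,
        key=lambda s: sum((a - b) ** 2 for a, b in zip(s[0], query)),
    )
    return nearest[1]


def accuracy(train, test):
    """Share of test samples whose predicted label matches the true label."""
    hits = sum(predict_1nn(train, x) == y for x, y in test)
    return hits / len(test)


def k_fold_accuracy(samples, k):
    """Mean held-out accuracy over k interleaved folds."""
    folds = [samples[i::k] for i in range(k)]
    scores = []
    for i in range(k):
        held_out = folds[i]
        train = [s for j, fold in enumerate(folds) if j != i for s in fold]
        scores.append(accuracy(train, held_out))
    return sum(scores) / k


# Hypothetical samples: ((pipe age in years,), deteriorated?) with two
# deliberately noisy labels at ages 10 and 35.
SAMPLES = [
    ((5,), 0), ((8,), 0), ((10,), 1), ((12,), 0),
    ((15,), 0), ((30,), 1), ((33,), 1), ((35,), 0),
    ((38,), 1), ((40,), 1), ((45,), 1), ((50,), 1),
]

if __name__ == "__main__":
    print("training accuracy:       ", accuracy(SAMPLES, SAMPLES))
    print("cross-validated accuracy:", k_fold_accuracy(SAMPLES, 4))
```

A large gap between the two figures suggests that the model has memorized its samples rather than learned a pattern that transfers to unseen pipes, which is the failure mode to expect with small datasets.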

Intelligent predictive analysis of pipeline damage now offers robust data support for subsequent assessments of pipeline service status, and the integration of these technologies has led to significant advances in urban intelligent water systems. However, some limitations remain to be addressed in the future.

(1) When modern technology is used for diagnosis, the assessment report provided by drainage pipeline experts is taken as the ground truth, and the computer-generated outcomes are compared against it for hyperparameter optimization. Nevertheless, the accuracy of this report is itself not guaranteed, owing to factors such as operator fatigue, varying levels of experience, and subjective judgment.

(2) No intelligent method can reach 100% accuracy, and no research is yet available on managing the risk of incorrect intelligent assessment results.

(3) Owing to variations in climate and terrain across regions, a uniform proportion of damage categories is difficult to achieve, and the degree of deterioration varies significantly. Further discussion is required to determine whether a dataset used to train a specific evaluation system yields comparable results in different regions.

Because intelligent diagnostic technology is still in its early stages, the aforementioned limitations are expected to be overcome through further exploration and research. With the advancement of computing power and algorithms in the era of artificial intelligence, the operation and maintenance system of urban pipeline networks has been steadily enhanced. The further integration of deep learning technology and pipeline diagnosis will significantly improve the well-being of urban residents, reduce inefficient human input, and enhance the economic benefits of urban intelligent operation and maintenance.

5. Conclusion

As artificial intelligence technology continues to advance, the detection and evaluation of urban underground pipe networks is transitioning from mere “observation” to informed “decision-making.” The integration of drainage pipeline detection and artificial intelligence will contribute significantly to the whole-life-cycle operation and maintenance of urban drainage systems. Automating defect identification, damage assessment, and early warning of deterioration in drainage pipelines by means of information technology remains a focal point in the development of pipeline diagnosis technology. This paper systematically reviews the current technology for diagnosing underground concrete drainage pipelines and outlines the application status and prospects of artificial intelligence in urban drainage pipeline diagnosis. The following conclusions are drawn:

(1) Traditional pipeline diagnosis has moved from purely manual detection to a semiautomatic stage. The integration of CCTV robots, periscopes, sonar, and other technologies has led to a more mature detection process. Nevertheless, in the traditional mode, the human and material resources invested are out of proportion to the diagnostic efficiency achieved, and the high-intensity, high-risk work environment is not in line with the development concept of the era of artificial intelligence.

(2) The application of artificial intelligence to the diagnosis of drainage pipelines has emerged as a new paradigm driven by data intelligence. New theories, equipment, and intelligent assessment methods for evaluating the health status of pipelines have been proposed, addressing the ambiguity and subjectivity of traditional assessment methods. Artificial neural networks and deep learning have demonstrated significant advantages in processing data across various dimensions.

(3) Computer vision technology based on convolutional neural network architectures has attracted significant attention. While many research teams have focused on image-based intelligent pipeline diagnosis, less consideration has been given to other types of deep learning networks and to other information acquisition channels. Integrated development, collaboration, and the exchange of frontier technologies across different fields are poised to become crucial avenues for breaking down information barriers and advancing the intelligent diagnosis of urban underground concrete drainage networks.

The development of urban underground concrete drainage pipeline diagnosis shows a clear trend: detection methods are evolving from traditional to intelligent, evaluation criteria from qualitative to quantitative, and the diagnosis process from manual to automatic. Looking forward, the field may focus on developing new pipeline detection equipment, lowering the threshold for using intelligent technology, expanding the types of data sources in the drainage network, and improving the corresponding databases. As the technology iterates from picture to video, from two-dimensional to three-dimensional, and from visual to perceptual, and as cutting-edge results from different professions are combined to break down the information silos between disciplines, a new specification may be compiled that accommodates advanced pipeline detection methods and accurately evaluates the degree of deterioration of pipe sections.

Data Availability

The datasets used to support the findings of this study are available from the corresponding author on request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

The research presented was financially supported by the National Natural Science Foundation of China (Grant no. 52278217) and the Key R&D Program Projects in Shaanxi Province (Grant no. 2024SF-YBXM-670).