Abstract

Objective. Determining epidermal growth factor receptor (EGFR) mutation and programmed death ligand-1 (PD-L1) expression status is crucial for choosing treatment strategies for patients with non-small-cell lung cancer (NSCLC). The rapid development of radiomics, including deep learning techniques, has indicated a potential role for medical images in the diagnosis and treatment of disease. Methods. Eligible patients diagnosed/treated at the West China Hospital of Sichuan University from January 2013 to April 2019 were identified retrospectively. The preoperative CT images were obtained, as well as the gene status regarding EGFR mutation and PD-L1 expression. The tumor region of interest (ROI) was delineated manually by experienced respiratory specialists. We used a 3D convolutional neural network (CNN) with ROI information as input to construct a classification model and established a prognostic model combining deep learning features and clinical features to stratify the survival risk of lung cancer patients. Results. The whole cohort (N = 1262) was divided into a training set (N = 882, 70%), validation set (N = 125, 10%), and test set (N = 255, 20%). The classification model achieved AUCs of 0.96 (95% CI: 0.94–0.98), 0.80 (95% CI: 0.72–0.88), and 0.73 (95% CI: 0.63–0.83) in the training, validation, and test cohorts, respectively. The combined prognostic model showed good performance on survival prediction in NSCLC patients (C-index: 0.71). Conclusion. In this study, a noninvasive and effective model was proposed to predict EGFR mutation and PD-L1 expression status as a clinical decision support tool. Additionally, the combination of deep learning features with clinical features showed good stratification capability in the prognostic model. Our team will continue to explore the application of imaging markers for treatment selection in lung cancer patients.

1. Introduction

Lung cancer is the leading cause of cancer-related deaths and the second most commonly diagnosed cancer around the world, with around 1.8 million deaths and 2.2 million new cases in 2020 [1]. Non-small-cell lung cancer (NSCLC) is the most common subtype of lung cancer, and its 5-year survival rate is less than 20%. The emergence of targeted therapy and immunotherapy has revolutionized the treatment of lung cancer and improved clinical outcomes among a subset of patients [2, 3]. Tyrosine kinase inhibitors (TKIs) targeting the epidermal growth factor receptor (EGFR) can extend progression-free survival (PFS) compared with conventional chemotherapy in EGFR-mutated NSCLC patients [4, 5]. Similarly, immune checkpoint inhibitors (ICIs) targeting the programmed death ligand-1 (PD-L1) expressed by tumor cells can prolong overall survival (OS) in PD-L1-positive patients with advanced NSCLC [6, 7]. Therefore, identifying the genetic status of patients is essential in the era of precision medicine.

At present, molecular genetic testing based on tumor tissue specimens is the gold standard for determining genetic status. However, the common methods of obtaining these specimens, such as surgery or biopsy, are invasive, expensive, and slow, and tumor tissue is heterogeneous in time and space. In addition, other limitations, including the difficulty of obtaining material, the potential need for a second biopsy, and poor DNA quality, can delay subsequent treatment decisions [8, 9]. Therefore, a noninvasive, convenient, and efficient method to predict genetic status is urgently needed.

As an effective screening tool for lung cancer, computed tomography (CT) can reduce lung cancer mortality through early detection and is, thus, widely used in clinical examinations [10, 11]. In the past decade, radiomic methods, especially deep learning, have unearthed high-throughput information in medical images [12]. Deep learning has achieved favorable performance in detecting lymph node metastases in breast cancer, estimating malignancy risk in lung cancer, and supporting rapid diagnosis during the COVID-19 pandemic [13–15]. Previous studies have shown that the features extracted from CT images of lung cancer cases might be related to gene expression patterns [16, 17]. Wang et al. used an end-to-end deep learning model to analyze CT images and predict EGFR mutation status [18]. Tian et al. provided a deep learning model to predict high PD-L1 expression in NSCLC and to infer clinical outcomes in response to immunotherapy [19]. Building on these earlier explorations, and to better satisfy the needs of clinical practice, there is still a need to explore the prediction of multigene expression using deep learning techniques.

Herein, we proposed a new approach to predict EGFR mutation and PD-L1 expression status in NSCLC patients based on deep learning technology and selected features to build a prognostic model. This noninvasive and easy-to-use method would assist clinicians in making treatment decisions for patients.

2. Materials and Methods

2.1. Data Acquisition and Processing

Eligible patients diagnosed/treated at the West China Hospital of Sichuan University from January 2013 to April 2019 were identified retrospectively. The inclusion criteria were as follows: (1) pathologically diagnosed with primary NSCLC; (2) tested for EGFR mutation and PD-L1 expression status; and (3) available CT images within 1 month before pathological diagnosis. The exclusion criteria were as follows: (1) missing critical clinical data; (2) no genetic testing, or failed testing because of poor tissue quality; and (3) no chest CT examination, or CT images in which the lesion was hard to distinguish and annotate, such as lesions adherent to the hilum or causing atelectasis.

In total, 1262 patients were included in this study and divided into a training set (N = 882), validation set (N = 125), and test set (N = 255) with a ratio of 7 : 1 : 2. Demographic information (age, sex, and smoking history), histopathology reports, therapy (targeted therapy, ICIs), and gene testing reports were collected from the hospital information system. Thin-layer (1–3 mm) CT images from multiple scanners (GE, Philips, Siemens, United Imaging Health) were collected. Follow-up for all patients ended in April 2021. Ethics approval was obtained from the ethics committee of West China Hospital, Sichuan University.
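A patient-level split such as the 7 : 1 : 2 division described above can be sketched as follows. This is only an illustration: the shuffling seed, the rounding rule, and the function name `split_patients` are assumptions, since the procedure used to assign patients is not stated. Simple rounding on 1262 patients yields 883/126/253, not the reported 882/125/255, so the authors' exact rule evidently differed.

```python
import random

def split_patients(patient_ids, ratios=(0.7, 0.1, 0.2), seed=0):
    """Shuffle unique patient IDs and cut them into train/validation/test lists."""
    ids = sorted(set(patient_ids))      # one entry per patient, in a stable order
    rng = random.Random(seed)           # fixed seed (an assumption) for reproducibility
    rng.shuffle(ids)
    n_train = round(ratios[0] * len(ids))
    n_val = round(ratios[1] * len(ids))
    train = ids[:n_train]
    val = ids[n_train:n_train + n_val]
    test = ids[n_train + n_val:]        # the remainder becomes the test set
    return train, val, test

# Hypothetical patient IDs 0..1261 standing in for the real cohort.
train_ids, val_ids, test_ids = split_patients(range(1262))
print(len(train_ids), len(val_ids), len(test_ids))
```

Splitting on patient IDs rather than on individual images keeps all scans from one patient in the same subset, which avoids leakage between training and test data.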

We collected tumor specimens through biopsy or surgical resection. EGFR mutation status was then determined by Amplification Refractory Mutation System-Polymerase Chain Reaction (ARMS-PCR) or next-generation sequencing (NGS). PD-L1 expression status was detected using the SP142 antibody in immunohistochemical (IHC) assays performed on the Ventana Benchmark platform. After review by senior pathologists, these test results were regarded as the gold standard in the current study.

2.2. Development of the Deep Learning Model

The chest CT images were taken with standard parameters and stored in DICOM format. Tumor region of interest (ROI) was delineated manually by experienced respiratory medicine specialists and then adjusted to 48 × 224 × 224 pixels from original lung window images with the original centers. The details of the adjustment were as follows: if the original scale of ROI was larger than 48 × 224 × 224 pixels, the exceeding part was cut; by contrast, if the original scale of ROI was less than 48 × 224 × 224 pixels, baseline values would be filled to standardize the size of the region. These 48 × 224 × 224 ROIs were then used to develop our deep learning model, during which they were divided into training, validation, and test sets with a ratio of 7 : 1 : 2 counting by individual patient. In regard to the genetic features, ROIs were categorized into four categories: double-negative, EGFR(−) but PD-L1(+), EGFR(+) but PD-L1(−), and double-positive.
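The size standardization described above (cut the excess when the ROI is larger than the target, fill when it is smaller, keeping the original center) can be sketched as below. This is a minimal illustration under stated assumptions: the fill value for the "baseline values" is not given in the text, so `fill_value` is a hypothetical parameter, and nested lists are used only to keep the sketch dependency-free (an array library would be the usual choice in practice).

```python
def _axis_plan(size, target):
    """Return (src_start, dst_start, length) for a centered crop or pad on one axis."""
    if size >= target:
        return (size - target) // 2, 0, target   # crop: skip the excess on both sides
    return 0, (target - size) // 2, size         # pad: place the data in the middle

def crop_or_pad_3d(volume, target_shape, fill_value=0):
    """Center-crop or pad a depth x height x width nested list to target_shape."""
    depth, height, width = len(volume), len(volume[0]), len(volume[0][0])
    t_depth, t_height, t_width = target_shape
    # Start from a volume filled entirely with the (assumed) baseline value.
    out = [[[fill_value] * t_width for _ in range(t_height)] for _ in range(t_depth)]
    sz, dz, nz = _axis_plan(depth, t_depth)
    sy, dy, ny = _axis_plan(height, t_height)
    sx, dx, nx = _axis_plan(width, t_width)
    for k in range(nz):
        for j in range(ny):
            out[dz + k][dy + j][dx:dx + nx] = volume[sz + k][sy + j][sx:sx + nx]
    return out

# Toy example: a 2 x 2 x 6 volume is padded in depth and cropped in width.
toy = [[[z * 100 + y * 10 + x for x in range(6)] for y in range(2)] for z in range(2)]
resized = crop_or_pad_3d(toy, (4, 2, 4), fill_value=-1)
print(len(resized), len(resized[0]), len(resized[0][0]))
```

In the study's setting, `target_shape` would be `(48, 224, 224)`.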

As the previous literature suggested, residual blocks can relieve the vanishing gradient problem caused by the depth of a neural network, and three-dimensional residual networks have performed well on both natural images [20] and medical images [21–24]. In the current study, considering the volumetric format of CT scans, we constructed a 3D convolutional neural network (CNN) model to classify EGFR and PD-L1 status. The architecture of our CNN is shown in Figure 1, and more details of the layers are presented in the Supplementary Materials (Table S1 and Table S2). Additionally, gradient-weighted class activation mapping (CAM) was used to localize and visualize the regions of the input images that were important for the prediction.
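Why a residual connection eases the vanishing gradient problem can be seen from a short derivation. The symbols are introduced here for illustration and are not taken from the paper: x is the input of one block, F the stacked convolutional layers inside it, y the block output, and L the loss. Any activation applied after the addition is ignored for simplicity.

```latex
y = x + F(x), \qquad
\frac{\partial \mathcal{L}}{\partial x}
  = \frac{\partial \mathcal{L}}{\partial y}
    \left( I + \frac{\partial F}{\partial x} \right)
```

Because of the identity term I, part of the gradient reaches x unchanged even when the Jacobian of F is close to zero, so stacking many such blocks does not by itself drive the gradient toward zero.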

During training, the batch size was 16. The model with the best performance on the validation set was selected for further testing.

2.3. Development of the Prognostic Model

The CNN extracted 512-dimensional deep learning features for patients in the training set. Next, the least absolute shrinkage and selection operator (LASSO) method, commonly used for regression on high-dimensional data [25], was applied using the glmnet package. We used the “multinomial” option to handle the multiclass outcome and varied the regularization parameter lambda to tune the LASSO model. To prevent overfitting, 5-fold cross-validation was used to resample the training set. The model with the smallest misclassification error was selected as the optimal model, and its nonzero coefficients defined the selected feature set.

Then, a prognostic model combining deep learning (DL) features and clinical features (age, sex, smoking history, EGFR-TKI targeted therapy, and ICI therapy) was generated to divide the patients into high-risk and low-risk groups according to a cutoff value determined with the survminer package. The performance of this model was evaluated in the validation set and the test set. For comparison, we also constructed a clinical prognostic model using only sex, age, smoking history, targeted therapy, and ICI therapy.

2.4. Statistical Analysis

The ANOVA test and chi-square test were used to evaluate differences in continuous variables and categorical variables in the baseline data, respectively. A CNN, one of the most important deep learning models, was used to construct the prediction model. In developing the prognostic model, we reduced dimensionality using LASSO and compared variables using Cox proportional hazards regression and the log-rank test. The tumor ROI was outlined with ITK-SNAP software. All analyses were conducted with R 4.0.2 (R Foundation for Statistical Computing, Vienna, Austria) and Python 3.10 (Python Software Foundation). Two-sided P values of <0.05 were regarded as statistically significant.

3. Results

3.1. Clinical Characteristics

The clinical characteristics of the 1262 patients are shown in Table 1. The mean age at diagnosis was 57.70 ± 10.50 years; 49.13% of patients were male, and 59.35% had never smoked. The numbers of patients in the four gene expression groups of double-negative, EGFR(−) but PD-L1(+), EGFR(+) but PD-L1(−), and double-positive were 276 (21.87%), 290 (22.98%), 502 (39.78%), and 194 (15.37%), respectively. As for the novel treatment strategies, 391 patients in the dataset received EGFR-TKI-targeted treatment, while 15 received ICI. In total, 41.91% of patients were diagnosed at stage I. The median follow-up was 31 (95% CI: 30–31) months, and the median overall survival (OS) was 44 (95% CI: 42–49) months. There was no significant difference among the training, validation, and test cohorts regarding age, sex, smoking, gene mutation status, EGFR-TKI-targeted therapy, ICI therapy, histopathology, tumor stage, median follow-up time, and overall survival.

3.2. Prediction Model Performance

Table 2 lists the predictive performance of the deep learning model, evaluated with the area under the ROC curve (AUC), accuracy, sensitivity, and specificity in the training, validation, and test sets. The macro-average AUCs were 0.96 (95% CI: 0.94–0.98), 0.80 (95% CI: 0.72–0.88), and 0.73 (95% CI: 0.63–0.83) in the training, validation, and test cohorts, respectively. The AUC of either gene classification was greater than 0.95 in the training set and greater than 0.65 in the test set (Figure 2). Our prediction system achieved an accuracy of 0.90 (95% CI: 0.86–0.93), a sensitivity of 0.74 (95% CI: 0.31–1), and a specificity of 0.93 (95% CI: 0.79–1) in the training set for the overall four-way classification. As the confusion matrices of the different datasets show (Figure 3), most errors occurred between adjacent groups. The deep learning model generated an attention map through CAM indicating the importance of each part of the tumor, and the dark areas might be the tissue between the tumor and the hilum (Figure 4).
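One common way to obtain a macro-average AUC for a four-class problem is to compute a one-vs-rest AUC per class and take the unweighted mean. The sketch below illustrates that definition only; the paper does not state which averaging procedure was used, and the class encoding 0 to 3, the function names, and the toy probabilities are all hypothetical.

```python
def binary_auc(labels, scores):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def macro_average_auc(y_true, y_prob, n_classes=4):
    """Return the unweighted mean of the one-vs-rest AUCs and the per-class AUCs."""
    per_class = []
    for c in range(n_classes):
        labels = [1 if y == c else 0 for y in y_true]   # class c against the rest
        scores = [p[c] for p in y_prob]                 # predicted probability of class c
        per_class.append(binary_auc(labels, scores))
    return sum(per_class) / n_classes, per_class

# Toy data: six patients, four classes (hypothetical predicted probabilities).
y_true = [0, 1, 2, 3, 0, 2]
y_prob = [
    [0.7, 0.1, 0.1, 0.1],
    [0.2, 0.5, 0.2, 0.1],
    [0.1, 0.2, 0.6, 0.1],
    [0.1, 0.1, 0.2, 0.6],
    [0.1, 0.6, 0.2, 0.1],
    [0.2, 0.2, 0.3, 0.3],
]
macro_auc, class_aucs = macro_average_auc(y_true, y_prob)
print(macro_auc, class_aucs)
```

The macro average weights every class equally, so a poorly predicted small class lowers the score as much as a poorly predicted large one.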

3.3. Prognostic Model Performance

We built a clinical prognostic model based on several clinical features (Table S3). The C-index was 0.64 (95% CI: 0.60–0.68). Then, we combined the 8 deep learning features selected by LASSO with the clinical features to build a new prognostic model, with a C-index of 0.71 (95% CI: 0.68–0.74). This prognostic model successfully stratified patients into high-risk and low-risk groups in regard to the risk of poor prognosis (death) (Figure 5). There was a significant difference in overall survival (OS) between these groups in both the training and test sets.
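The C-index reported above measures how often a model ranks patients correctly by risk. A minimal sketch of Harrell's concordance index is given below; it is an illustration, not the authors' implementation (their analysis used R). The function name, variable names, and toy data are hypothetical, and pairs with tied survival times are simply skipped.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C: share of comparable pairs where the higher-risk patient dies first."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue                      # a censored patient cannot be the observed earlier death
        for j in range(n):
            if times[i] < times[j]:       # patient i died while patient j was still under follow-up
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0     # correct ordering
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5     # tied risk scores count half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Toy data: follow-up in months, event = 1 for death and 0 for censoring.
times = [5, 10, 12, 20, 30]
events = [1, 1, 0, 1, 0]
risk = [0.9, 0.4, 0.5, 0.6, 0.1]
c_index = concordance_index(times, events, risk)
print(c_index)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the clinical model (0.64) and the combined model (0.71) are compared.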

4. Discussion

Rapid determination of gene mutation status is crucial for therapy decisions, especially for patients who are potentially suitable for EGFR-TKI or ICI treatment. In this study, a rapid approach using deep learning based on CT images was proposed to predict EGFR mutation and PD-L1 expression status in NSCLC, with AUCs of 0.96, 0.80, and 0.73 in the training, validation, and test cohorts. Patients with a positive status might be likely to benefit from TKI and/or ICI treatment, while double-negative patients are unlikely to respond well to these strategies and should be considered for other therapies as soon as possible. Furthermore, the predictive model was further developed to stratify patients by their risk of poor prognosis, potentially serving as a clinical reference.

In the field of lung cancer, radiomics has developed rapidly due to the availability of chest CT and the integration of artificial intelligence (AI) [26]. On the one hand, chest CT is the most routine examination in the diagnosis and treatment of lung cancer, since it is noninvasive, convenient, and easy to obtain in the routine clinical workflow. Almost all NSCLC patients undergo multiple CT examinations to track the progression of tumor lesions. On the other hand, in recent years, AI technology, especially deep learning, has been widely applied to the interpretation of medical images. Deep learning holds great potential for lung cancer screening, diagnosis, and treatment, from the detection of lung nodules to the discrimination of benign from malignant nodules and further subtype classification [14, 27, 28].

A great deal of attention has been paid to studies that combine genomics and radiomics. In the era of precision medicine, there is a trend toward treating patients with lung cancer only after their gene expression has been clarified. Some previous studies have used deep learning to predict EGFR, PD-L1, or ALK status individually and achieved favorable performance (Table 3) [18, 19, 29–34]. However, these previous studies each focused on predicting the status of only one gene. To our knowledge, the current study is the first to predict EGFR mutation and PD-L1 expression status simultaneously from chest CT scans, and it uses the largest cohort reported so far. At the same time, the CAM method used in this study visualized the prediction model and improved the interpretability of deep learning, which has been referred to as a “black box.” Another advantage of our model was its use of 3D image input, which displays the characteristics of the lesion more fully and provides more abundant image information; this might account for the fusion prognostic model performing better than the clinical-only model.

More and more studies have demonstrated that image features can predict gene status and treatment response and might assist clinical practice in the future [35, 36]. Still, several details remain to be addressed. For example, when the output layer of this model was set to two categories, we obtained two models that predicted EGFR mutation and PD-L1 expression status separately. Although these models could achieve the research goal, their performance was unstable. Some studies have suggested a correlation between EGFR and PD-L1 expression [37], which may explain the stability of the four-class model in this study. Although the four categories could reflect the relationship between the genes, more multicenter data are needed to further improve model performance. Therefore, building a more clinically practical model will be the focus of our future work.

Our research has several limitations. Firstly, it was a single-center retrospective study; we plan to alleviate this problem to some extent by testing the generalization and robustness of the model on an external dataset. Secondly, we have not yet assessed treatment response, which is a concern for targeted therapy and immunotherapy drugs. Thirdly, we focused on two clinically valuable markers, EGFR and PD-L1, which are tested in routine clinical practice; other genes, including but not limited to ALK and ROS1, and gene panels are still worth investigating. If the deep learning model predicts the wrong gene expression or mutation status, patients could receive inappropriate treatment. Molecular tests are therefore still needed to confirm that a therapy is appropriate until such AI software is approved. Furthermore, we will try to incorporate a variety of indicators related to prognosis, such as tumor size, volume, shape, ground glass opacity (GGO), or solid components, to optimize the prognostic model in the future.

5. Conclusions

In conclusion, a noninvasive and effective model was proposed to predict EGFR mutation and PD-L1 expression status, which can serve as a clinical decision support tool. Additionally, the combination of deep learning features with clinical features improved the stratification capability of the prognostic model. Our team will further explore the application of imaging markers in treatment decisions for lung cancer patients.

Data Availability

The data used to support the findings of this study are available on request to the corresponding author.

Conflicts of Interest

All authors have no conflicts of interest.

Authors’ Contributions

WL and ZY were involved in the study design. CW, JS, and XX were involved in the organization of the entire project, data analysis with a clinical perspective, and manuscript writing. JS and JL collected the imaging and clinical data. KZ, KFZ, and JG were involved in the establishment of the algorithm. All authors contributed to the article and approved the submitted version.

Acknowledgments

This study was supported by the National Natural Science Foundation of China (grant nos. 91859203 and 81871890), the Science and Technology Project of Sichuan (grant 2020YFG0473), Chinese Postdoctoral Science Foundation (2021M692309), Postdoctoral Program of Sichuan University (2021SCU12018), and Postdoctoral Program of West China Hospital, Sichuan University (2020HXBH084).

Supplementary Materials

Table S1: details of residual blocks. Table S2: details of the constructed model. Table S3: prognostic Cox models with and without the DL features. (Supplementary Materials)