Liquid biopsy-derived extracellular vesicle protein biomarkers for diagnosis and prognostic assessment of lung squamous cell carcinoma
Cancer Cell International volume 25, Article number: 161 (2025)
Abstract
Background
For patients with nodules detected in imaging that are indeterminate for malignancy, achieving accurate, early, and non-invasive diagnosis of Lung Squamous Cell Carcinoma (LUSC) remains a significant challenge. Therefore, we aimed to establish diagnostic and prognostic models by identifying plasma extracellular vesicles (EVs) associated protein biomarkers specific to LUSC.
Methods
This study employed a novel nanomaterial, NaY, for the enrichment of EVs from plasma. Validation was conducted through transmission electron microscopy, nanoparticle tracking analyses, and Western blotting. Machine learning algorithms were utilized to compute protein biomarkers associated with LUSC and establish a diagnostic model. Additionally, a prognostic prediction model for LUSC was developed using a combination of 101 machine learning algorithms. Risk scoring of patients was performed to explore the underlying reasons for prognostic differences between high and low-risk groups.
Results
The results of three experiments demonstrate that the new nanomaterial NaY effectively enriches EVs from plasma. Analysis of the enriched profile reveals pathways related to glycolysis/gluconeogenesis and carbon metabolism enriched in plasma EVs of LUSC patients. Thirty-eight LUSC-related EV biomarkers were identified, from which five proteins (TUBB3, RPS7, RPLP1, KRT2, and VTN) were selected to establish a diagnostic model distinguishing between benign and LUSC nodules. The diagnostic efficacy of RPS7 and TUBB3 was further validated in independent samples using ELISA experiments. Furthermore, DPYD, GALK1, CDC23, UBE2L3, RHEB, and PSME1 were determined as potential prognostic biomarkers. Subsequently, risk scores were computed for each sample, classifying all patients into high and low-risk groups. Enrichment analysis revealed that EVs from the high-risk group contained proteins promoting cell proliferation and invasion, while those from the low-risk group were enriched in immune-related protein biomarkers.
Conclusions
The novel nanomaterial NaY effectively enriches EVs from plasma. Utilizing plasma EV biomarkers, the diagnostic model demonstrates strong discriminative ability between benign and malignant pulmonary nodules in patients.
Background
Lung squamous cell carcinoma (LUSC), a major histological subtype of non-small cell lung cancer (NSCLC), accounts for approximately 25–30% of all lung cancer cases and predominantly affects older male patients [1,2,3]. Although low-dose computed tomography (LDCT) has improved the early detection of lung cancer, its ability to accurately determine the malignant potential of pulmonary nodules remains limited [4]. Moreover, currently available noninvasive serum biomarkers, including squamous cell carcinoma antigen (SCCA), carcinoembryonic antigen (CEA), CYFRA 21-1, and neuron-specific enolase (NSE), have shown suboptimal diagnostic performance, with areas under the curve (AUCs) ranging from 0.598 to 0.702 [5]. These limitations underscore the critical need for novel, reliable, and noninvasive biomarkers with enhanced diagnostic and prognostic value in patients with LUSC.
Extracellular vesicles (EVs) carry diverse molecular cargo, such as nucleic acids, proteins, and lipids, and play essential roles in intercellular communication and various pathological processes [6]. Recent studies have highlighted their promise as noninvasive biomarkers for cancer diagnosis and prognosis [7]. However, clinically validated EV-based biomarkers specifically for early detection and prognosis of LUSC remain limited.
The aim of this study was to employ a novel nanomaterial, NaY, for extracting EVs from plasma to explore the molecular characteristics of plasma-derived EVs in both LUSC and benign lung disease (BLD) patients. We sought to characterize EV biomarkers associated with LUSC and establish an accurate discriminative diagnostic model for distinguishing between benign and malignant nodules. Additionally, we sought to construct a prognostic model to assess the prognosis of LUSC patients.
Methods
Plasma sample collection
This research received approval from the Ethics Committee of the Cancer Hospital of Peking Union Medical College and the Chinese Academy of Medical Sciences (PUMC&CAMS) and was conducted in strict compliance with the principles outlined in the Declaration of Helsinki.
Peripheral blood samples were obtained from patients diagnosed with LUSC, pneumonia granuloma, or pulmonary tuberculosis granuloma. Blood was collected into EDTA tubes and then centrifuged at 1,000 rpm for 20 min at 4 °C to obtain the plasma, which was subsequently stored at −80 °C for further analysis.
Notably, all plasma samples used in this study were obtained prior to any surgical procedure. Pathological information for each sample was obtained through either surgical resection or histopathological examination of tissue sections and was carefully recorded to ensure analytical accuracy.
Patients and clinical characteristics
Our study enrolled 357 patients, 208 of whom were included in our mass spectrometry (MS) cohort (patients whose preoperative plasma samples were subjected to MS analysis). This cohort consisted of 124 LUSC patients and 84 BLD patients. To identify EV biomarkers associated with LUSC, we analyzed tissue proteomic data from 108 additional LUSC patients from the PDC000234 dataset. Additionally, an ELISA cohort incorporating the plasma samples from 105 LUSC patients and 44 BLD individuals was formed to assess the accuracy of the biomarkers used to create the diagnostic model. Detailed clinical information for all participants in this study is summarized in Table S1.
EV isolation and characterization
The n-LAPE/MS™ kit, incorporating NaY (zeolite NaY), was originally developed by Tianjin Key Laboratory of Clinical Multiomics specifically for the enrichment of low-abundance plasma proteins. In collaboration with our group, we demonstrated for the first time that NaY nanomaterial within this kit can also robustly and selectively enrich EVs from plasma samples. Therefore, validating the efficiency and specificity of NaY compared to conventional ultracentrifugation methods represents an important novel contribution of this study, rather than merely comparing two existing methods. The detailed synthesis method and physicochemical characterization of the zeolite NaY used in this kit have been previously described [8]. Briefly, NaY was synthesized following the molar gel formulation: 1 SiO₂: 0.09 Al₂O₃: 0.37 Na₂O: 16 H₂O. The resulting zeolite exhibited a polyhedral morphology with a particle size of approximately 400 nm, as confirmed by SEM. Structural and compositional analyses performed using EDS, XRD, and FTIR demonstrated good crystallinity and consistent composition, with characteristic peaks corresponding to internal and external tetrahedral linkages of the NaY framework. These physicochemical properties—particularly the uniform particle size and large surface area—enable the effective adsorption of low-abundance proteins and, as newly discovered in our collaborative study, extracellular vesicles (EVs) as well [9, 10].
Initially, 40 µL of plasma sample was diluted with a mixture of 10 µL of NaY and 260 µL of binding buffer. These samples were then incubated at 37 °C for 10 min with agitation at 1,000 rpm. After incubation, the samples were centrifuged at 12,000 × g for 5 min to remove unbound proteins from the supernatant. The enriched NaY was then subjected to three washes with 500 µL of washing buffer, each followed by centrifugation. The NaY solution was resuspended in 50 µL of lysis buffer containing tris(2-carboxyethyl)phosphine (TCEP) and chloroacetamide (CAA) and heated at 95 °C for 10 min with agitation. After cooling to room temperature, 2 µL of trypsin digestion buffer was added, and the mixture was incubated at 37 °C for 4 h with agitation. The peptides were subsequently precipitated with 950 µL of acetonitrile (ACN) and desalted using the SP3 method on NaY. The peptides were eluted with 20 µL of elution buffer, and their concentration was determined using a Nanodrop for MS detection [8].
Furthermore, the concentration and size distribution of EVs were assessed by nanoparticle tracking analysis (NTA) using a NanoSight NS300™ instrument and NTA 3.2 software (Malvern Instruments, UK). The data from the completed video tracks were further analyzed and organized, including correcting the particle concentration for the dilution factor (1:1000).
Additionally, randomly selected EV-enriched suspensions were subjected to transmission electron microscopy (TEM) analysis. Polyenylphosphatidylcholine (PPC)-containing beads were first fixed with 1 mL of 2.5% glutaraldehyde in PBS for 2 h. After three washes with PBS, the particles were resuspended in PBS. A PBS solution with PPC-containing beads (10 µL) was applied onto 200-mesh Formvar carbon-coated copper grids, allowing absorption for 10 min. Excess solution was removed by gently blotting the edge of each grid with filter paper. Next, the sample was negatively stained using a saturated uranyl acetate solution with a 2-minute incubation at room temperature. The residual, unevaporated solution was absorbed using filter paper. The samples were examined using a Hitachi HT7700 transmission electron microscope at 80 kV.
For protein extraction, EVs were prepared as described above. RIPA buffer (C1053, APPLYGEN) was used to extract the proteins. The protein concentration was determined using the BCA Reagent (#23225, Thermo Scientific). Immunoblot analysis was performed with primary antibodies, including anti-CD9 (A19027, ABclonal, 1:1000), anti-TSG101 (A5789, ABclonal, 1:1000), anti-Grp94 (A6272, ABclonal, 1:1000), and anti-CD63 (ARP60760_P050, ABclonal, 1:1000), along with a horseradish peroxidase (HRP)-conjugated goat anti-rabbit secondary antibody. Signals for Western blot analysis were detected using Super Enhanced Chemiluminescent Plus (P1050, APPLYGEN), and the data were quantified using NIH ImageJ software (NIH, Bethesda, MD, USA).
EV MS analysis
For data-independent acquisition (DIA) analysis, iRT (obtained from Biognosys) was uniformly added to each run using a Thermo Scientific U3000 nanoflow LC system coupled with a Q Exactive HF mass spectrometer. Peptide samples dissolved in loading buffer (2% ACN) were separated on a 150 μm ID × 30 cm C18 column (1.9 μm, 120 Å, Dr. Maisch GmbH) with a 150-minute gradient (A: 2% ACN, 0.1% FA; B: 80% ACN, 0.1% FA). The gradient conditions were as follows: 0–5 min, 3–6% B; 6–44 min, 6–90% B; 45–54 min, 90% B; and 55–60 min, 6% B, all at a flow rate of 600 nL/min. The positive ion mode used a 2,000 V spray voltage, and the ion transfer tube temperature was maintained at 270 °C.
Then, a full MS scan was acquired at a resolution of 60,000 (at m/z 200). For the Orbitrap mass analyzer (350–1,500 m/z), the automatic gain control (AGC) target value was set at 1e6 and the maximum injection time at 20 ms. In MS/MS, the AGC target value was also set at 1e6, and the maximum injection time was automatically determined during high-energy collision dissociation (HCD) fragmentation at a resolution of 30,000 at m/z 200. The normalized collision energy (NCE) was consistently set at 28%.
In proteome DIA MS runs, fragment analysis involved 40 DIA isolation windows, each with widths tailored to data-dependent acquisition (DDA) search outcomes. MS scans were conducted prior to each DIA cycle, ensuring a thorough analysis.
MS database search
The DIA data were analyzed by searching against the human UniProt database, which contains 20,365 sequences. We employed DIA-NN (version 1.8.1) with default settings, a trypsin/P digestion rule, high protein and peptide confidence levels, and a false discovery rate (FDR) of 0.01. This approach ensured rigorous and reliable data analysis.
Quantification and normalization
Two filters were applied: samples were retained only if at least 3,000 proteins were identified, and proteins were retained only if they were detected in at least 30% of the samples. Subsequently, the protein data were normalized using the minTOP20-fraction of total (FOT) method, in which the intensity of each protein is divided by the summed intensity of all proteins in the sample after excluding the 20 most intense proteins, and then multiplied by 1,000,000:

$$\mathrm{FOT}_i = \frac{I_i}{\sum_{j} I_j - \sum_{k \in \mathrm{top20}} I_k} \times 1{,}000{,}000$$

where $I_i$ is the intensity of protein $i$ in a given sample. Missing values were replaced with 0.1 before log2 transformation. The MS data and relevant detailed clinical information for these cases can be found in Table S3.
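The normalization step can be illustrated with a minimal Python sketch. It assumes that the top-20 intensities are subtracted from the denominator only, which is one reading of the minTOP20-FOT description; the function names and the `n_top`, `scale`, and `floor` parameters are hypothetical and are not taken from the study's code.

```python
import math


def min_top20_fot(intensities, n_top=20, scale=1_000_000):
    """Normalize one sample as intensity / (total - sum of n_top largest) * scale.

    `intensities` maps protein name -> raw intensity, with None for missing values.
    Assumption: the top-n_top proteins are excluded from the denominator only.
    """
    observed = [v for v in intensities.values() if v is not None]
    top_sum = sum(sorted(observed, reverse=True)[:n_top])
    denominator = sum(observed) - top_sum
    if denominator <= 0:
        raise ValueError("no intensity left after removing the top proteins")
    return {
        name: (None if value is None else value / denominator * scale)
        for name, value in intensities.items()
    }


def log2_with_floor(normalized, floor=0.1):
    """Replace missing values with `floor` (0.1 in the text) and log2-transform."""
    return {
        name: math.log2(floor if value is None else value)
        for name, value in normalized.items()
    }


if __name__ == "__main__":
    # Toy sample; n_top=2 keeps the example small (the text uses the top 20).
    sample = {"A": 100.0, "B": 50.0, "C": 30.0, "D": 20.0, "E": None}
    print(log2_with_floor(min_top20_fot(sample, n_top=2)))
```

Excluding the most abundant proteins from the denominator keeps a few dominant plasma proteins from compressing the normalized values of everything else.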
Differentially expressed protein analysis
All data analyses in this study were carried out using R software version 4.1.2. To analyze differential protein expression between LUSC and BLD samples, we employed the limma package (v 3.52.4) with the following stringent criteria: adj.P.Val ≤ 0.05 and |logFC| ≥ 1 [11]. Furthermore, we utilized the clusterProfiler package (v 4.4.4) [12] to conduct Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses on the differentially expressed proteins. Gene set variation analysis was conducted with the GSVA package (v 1.44.5) [13]. The org.Hs.eg.db package (v 3.15.0) was used as the annotation database for all these analyses [12].
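The selection criteria can be sketched in Python as follows. This is not a reimplementation of limma: it assumes that per-protein p-values and log fold changes have already been computed, and that the adjusted p-values use the Benjamini-Hochberg procedure. The function names are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


def select_differential(names, log_fc, pvalues, alpha=0.05, min_abs_lfc=1.0):
    """Split proteins into up/down lists using adj. p <= alpha and |logFC| >= min_abs_lfc."""
    adjusted = benjamini_hochberg(pvalues)
    up, down = [], []
    for name, lfc, q in zip(names, log_fc, adjusted):
        if q <= alpha and abs(lfc) >= min_abs_lfc:
            (up if lfc > 0 else down).append(name)
    return up, down


if __name__ == "__main__":
    # Hypothetical values for four proteins, for illustration only.
    up, down = select_differential(
        ["P1", "P2", "P3", "P4"], [1.5, -2.0, 0.5, 3.0], [0.001, 0.002, 0.03, 0.5]
    )
    print(up, down)
```

A protein must pass both thresholds: P3 in the toy input has a small adjusted p-value but too small a fold change, and P4 has a large fold change but is not significant.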
Diagnostic biomarker selection of the identified plasma EV proteins
We employed the pROC package (v 1.18.0) [14] to calculate the AUC values for a total of 48 plasma EV proteins associated with LUSC and 59 plasma EV proteins associated with BLD.
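For a single protein, the AUC equals the probability that a randomly chosen LUSC sample has a higher value than a randomly chosen BLD sample, with ties counted as one half. The sketch below computes it by direct pairwise comparison; it illustrates the quantity only and is not the pROC implementation, and the function name is hypothetical.

```python
def roc_auc(scores_positive, scores_negative):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    if not scores_positive or not scores_negative:
        raise ValueError("both groups must contain at least one sample")
    wins = 0.0
    for pos in scores_positive:
        for neg in scores_negative:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(scores_positive) * len(scores_negative))


if __name__ == "__main__":
    # Hypothetical log2 abundances of one EV protein in each group.
    lusc = [3.0, 4.0, 5.0]
    bld = [1.0, 2.0, 3.0]
    print(round(roc_auc(lusc, bld), 3))
```

For a protein that is lower in LUSC than in BLD, the same function returns a value below 0.5, so swapping the two groups (or taking one minus the result) gives the discriminative AUC.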
Establishment of diagnostic models
Diagnostic models were developed using machine learning with the mlr3 package (v 0.16.0) and related resources (https://mlr3.mlr-org.com). The data obtained from the study participants in the discovery cohort were randomly partitioned at a 7:3 ratio. The larger subset (7/10) served as the training set for model construction, while the smaller subset (3/10) was designated the test set. Models established in the training set were subsequently validated independently using the test set, and finally confirmed within the entire cohort (all sets combined). The machine learning learners were generated using the R packages “mlr3,” “mlr3learners,” and “mlr3extralearners.” Feature selection was conducted using a decision tree. Cross-validation comprised fivefold training/test splits with 15 repetitions.
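The partitioning scheme (a 7:3 split followed by repeated fivefold cross-validation on the training portion) can be sketched in plain Python. The study used mlr3 in R; the functions below are hypothetical stand-ins, and stratifying the split by class label is an assumption made here so that both groups appear in each subset.

```python
import random


def stratified_split(labels, train_fraction=0.7, seed=0):
    """Split sample indices into train/test sets, preserving class proportions."""
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        cut = round(len(indices) * train_fraction)
        train.extend(indices[:cut])
        test.extend(indices[cut:])
    return sorted(train), sorted(test)


def repeated_kfold(indices, k=5, repeats=15, seed=0):
    """Yield (fit, validation) index lists for `repeats` rounds of k-fold CV."""
    rng = random.Random(seed)
    for _ in range(repeats):
        shuffled = list(indices)
        rng.shuffle(shuffled)
        folds = [shuffled[i::k] for i in range(k)]
        for held_out in range(k):
            validation = folds[held_out]
            fit = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
            yield fit, validation


if __name__ == "__main__":
    # Hypothetical label vector, for illustration only.
    labels = ["LUSC"] * 80 + ["BLD"] * 84
    train, test = stratified_split(labels)
    print(len(train), len(test), sum(1 for _ in repeated_kfold(train)))
```

Keeping the test set out of the cross-validation loop means that it is touched only once, after model selection is complete.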
Initially, multiple common machine learning algorithms, including XGBoost, RF, LightGBM, SVM and Logistic Regression, were evaluated using fivefold cross-validation. XGBoost consistently outperformed other algorithms in terms of accuracy, sensitivity, specificity, and overall predictive capability; hence, it was selected to construct the diagnostic model (Fig. S1).
Enzyme-linked immunosorbent assay (ELISA)
We employed ELISA to assess the expression of RPS7 and TUBB3 in patient plasma. Following the manufacturer’s instructions, commercial assay kits for RPS7 (MyBioSource, USA, Catalog Number: MBS7204349) and TUBB3 (CUSABIO, China, Catalog Number: CSB-E14121h) were used to determine the expression levels of these proteins in preoperative plasma samples. The ELISA data and the corresponding detailed clinical information can be found in Table S4.
Identification of prognostically relevant EV protein biomarkers
To identify EV proteins associated with overall survival (OS), we conducted time-dependent univariate Cox regression analysis using the ezcox package (v 1.0.4) (https://github.com/ShixiangWang/ezcox). This analysis allowed us to screen proteins that exhibited a significant relationship with OS, providing valuable insights into their prognostic relevance.
Construction and verification of an EV-based risk signature for prognosis prediction
To develop a robust EV-based risk signature with high accuracy and stability, we combined 10 machine learning algorithms into 101 algorithm combinations. These integrative algorithms encompassed a diverse set of methods, including random survival forest (RSF), elastic network (Enet), least absolute shrinkage and selection operator (LASSO), Ridge, stepwise Cox, CoxBoost, partial least squares regression for Cox (plsRcox), supervised principal components (SuperPC), generalized boosted regression modeling (GBM), and survival support vector machine (survival-SVM) methods. Our approach commenced by employing these 101 algorithm combinations to identify the set of proteins with the best prognostic relevance. Subsequently, we divided all LUSC patients into two sets: a training set comprising 70% of the patients (88 patients) and a test set comprising the remaining 30% (36 patients).
Following development and validation of the risk signatures, we calculated a survival risk score for each LUSC patient based on the standardized expression levels of six specific proteins. The randomForestSRC package (v 3.2.2) was used to calculate the risk scores for each LUSC patient, serving as an indicator of their prognosis, with β representing the regression coefficient for each variable. The optimal risk score from the training set was used as the threshold to categorize all included LUSC samples into either a low-risk or high-risk subset [15, 16].
Assessing the prognostic efficacy of the risk signature
Following the development of the risk signatures, it was crucial to assess their reliability and robustness in predicting prognostic outcomes. The validity of the signatures was verified from two key perspectives. First, we utilized Kaplan–Meier curves to compare the OS of the high-risk and low-risk patients in both the training and testing sets. This analysis was conducted using the survival package (v 3.5-3) to visualize the differences in survival outcomes between the risk subsets. To evaluate the predictive power of the risk signature for survival, we generated time-dependent ROC curves, which offer a dynamic assessment of the signature’s performance over time. Additionally, we calculated the AUC to quantitatively evaluate the predictive accuracy and specificity of the risk signature.
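The Kaplan–Meier estimate behind these curves can be written in a few lines. The sketch below is a plain-Python illustration of the product-limit estimator and is not the survival package's implementation; the function name and the toy data are hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    `times` are follow-up times; `events` are 1 for death and 0 for censoring.
    Returns a list of (time, survival_probability) at each distinct death time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        current_time = data[i][0]
        deaths = 0
        leaving = 0
        # Group all subjects that share this follow-up time.
        while i < len(data) and data[i][0] == current_time:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((current_time, survival))
        at_risk -= leaving
    return curve


if __name__ == "__main__":
    # Hypothetical follow-up (months) for one risk group.
    print(kaplan_meier([5, 8, 8, 12, 20], [1, 1, 0, 1, 0]))
```

Running the estimator separately on the high-risk and low-risk groups gives the two curves that are compared in a Kaplan–Meier plot.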
Enriched pathways in the high- and low-risk groups
Single-sample gene set enrichment analysis (ssGSEA) was conducted using the GSVA package (v 1.44.5), with the specified method parameter set to ‘ssgsea’ [13]. For this analysis, we used gene sets from the KEGG obtained from The Broad Institute (https://www.gsea-msigdb.org/gsea/msigdb/human/collections.jsp#C1). From this analysis, we extracted normalized ssGSEA enrichment scores for pathways that exhibited significance, with an FDR threshold set at less than 0.05. This procedure allowed us to identify and focus on pathways that were statistically meaningful in our study.
Results
Description of the workflow
The workflow of our study is illustrated in Fig. 1. Briefly, plasma EVs from 124 patients with LUSC and 84 patients with BLD were enriched using NaY and validated by TEM, NTA, and Western blotting. Proteomic analysis identified LUSC-related EV proteins. Selected proteins were used to construct diagnostic and prognostic models using machine learning algorithms. Finally, we explored the molecular mechanisms underlying prognostic differences by performing pathway enrichment analysis of EV protein profiles (Fig. 1).
Identification of plasma EVs
NaY was employed to enrich EVs from plasma samples of both BLD and LUSC patients [8]; these EVs were then characterized by TEM, NTA and WB. The results of TEM analysis demonstrated that the EVs isolated with NaY exhibited a typical vesicular structure (Fig. 2A). Furthermore, according to NTA, the majority of these isolated EVs were approximately 120 nm in size (Fig. 2B). WB was subsequently performed to further confirm the identity of the EVs, validating the presence of well-established EV-positive biomarkers such as CD63, CD9, and TSG101, while also confirming the absence of negative biomarkers such as GRP94 (Fig. 2C).
Fig. 2. Characterization of enriched EVs in plasma and comparison with pure EVs. (A) Transmission electron microscopy images of EVs isolated with NaY from BLD plasma. (B) EV size distribution, particle concentration and mode size in BLD plasma as determined by nanoparticle tracking analysis. (C) Plasma EVs from LUSC patients and BLD patients express EV-positive (CD63, CD9, and TSG101) and EV-negative (GRP94) biomarkers, as indicated by Western blot (WB) analysis. (D) Number of proteins detected in pure EVs and enriched EVs in plasma. (E) The overlap of protein IDs detected in pure EVs and enriched EVs in plasma. (F) Quantitative MS intensities of EV-related biomarkers detected in pure EVs and enriched EVs in plasma.
To further validate our findings, we employed the conventional UC method [19] to enrich pure EVs from plasma. Subsequently, MS analysis was conducted, and the results were compared with those obtained from the identification of EVs enriched from plasma using NaY. Our findings revealed that 2982 proteins were detected in pure EVs, while 3331 proteins were identified in plasma-enriched EVs (Fig. 2D). Among the identified proteins, 2610 were common between the two sets of EVs, representing 70.5% of the total protein count. This substantial overlap in protein species between the two enrichment methods underscores a high concordance in the types of proteins detected (Fig. 2E). The lack of detectable EV markers in certain BLD samples (e.g., sample #2) may be attributed to individual biological variability, particularly lower EV abundance in certain plasma samples, or to experimental factors such as differences in EV isolation efficiency. Notably, the quantitative values of the EV-associated biomarkers CD9, CD63, and TSG101 were equivalent, further validating the consistency of the enrichment efficiency of these key exosomal proteins (Fig. 2F). The details of the MS data for the UC and NaY analyses are provided in Table S2.
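The reported overlap is consistent with dividing the shared proteins by the union of the two protein sets, as the following short check shows. The function name is hypothetical; the counts are those stated above.

```python
def overlap_percentage(count_a, count_b, shared):
    """Shared items as a percentage of the union of two sets."""
    union = count_a + count_b - shared
    return 100 * shared / union


if __name__ == "__main__":
    # 2982 proteins by UC, 3331 by NaY, 2610 in common -> union of 3703 proteins.
    print(f"{overlap_percentage(2982, 3331, 2610):.1f}%")
```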
Expression profiles of EV proteins in LUSC and BLD
To investigate the protein expression profiles of plasma-derived EVs from patients with LUSC and BLD, we employed the methodology described in Sec. EV isolation and characterization. EVs were enriched from both LUSC and BLD samples, and proteomic analysis of the EVs was subsequently conducted using MS. This analysis included 56 samples derived from the plasma of inflammatory granuloma patients and 28 samples from tuberculosis granuloma patients (considered the BLD group), and 124 samples from LUSC patients (considered the LUSC group). Among these samples, a total of 4284 proteins were identified in more than 50% of the samples. Differential protein expression patterns were visualized in the form of a volcano plot (Fig. 3A), where 643 proteins exhibited significant differential expression between the two groups (|log2(FC)| > 1, adjusted p value < 0.05). Among these candidate proteins, 429 proteins were upregulated, while 214 were downregulated. Principal component analysis (PCA) further illustrated a clear demarcation between the protein profiles of the LUSC and BLD samples, emphasizing the substantial disparities in the systemic blood EV proteomic landscape of LUSC patients (Fig. 3B).
Fig. 3. Molecular signatures of plasma-derived EVs from LUSC and BLD patients. (A) Volcano plot of identified proteins in LUSC vs. BLD. Proteins with significantly increased abundance are colored in red, and proteins with lower abundance are colored in blue. (B) Principal component analysis for BLD and LUSC samples. (C) GO pathway enrichment results for LUSC. (D) GO pathway enrichment results for BLD. (E) KEGG pathway analysis for LUSC. (F) KEGG pathway analysis for BLD.
To gain deeper insights into the biological characteristics of LUSC, we conducted a proteomic-level analysis to evaluate the molecular distinctions between plasma-derived EV samples from LUSC and BLD patients. GO analysis revealed a range of significantly enriched biological processes in the LUSC group, including Golgi vesicle transport, phospholipid metabolic processes, and sulfur compound metabolic processes (Fig. 3C). Conversely, the BLD group exhibited enrichment in processes related to skin development, keratinization, keratinocyte differentiation, and epidermis development (Fig. 3D). In the cellular component analysis, the secretory granule lumen, cytoplasmic vesicle lumen, mitochondrial inner membrane, ribosomal subunit, mitochondrial matrix, and organellar ribosome cellular components were enriched in the proteins of the LUSC group (Fig. 3C). In contrast, in the BLD group, components such as collagen-containing extracellular matrix, keratin filaments, and intermediate filament cytoskeleton were enriched in its proteins (Fig. 3D). Molecular function analysis revealed that in the LUSC group, GTPase activity, phosphatase activator activity, GTP binding, structural constituents of ribosomes, and protein phosphatase regulator activity were enriched in its proteins (Fig. 3C), while the BLD group displayed enrichment related to serine-type endopeptidase inhibitor activity, structural constituents of the skin epidermis, structural constituents of the cytoskeleton, and haptoglobin binding (Fig. 3D). These results suggest that, in LUSC patients, the primary biological processes are centered on substance metabolism and transport (Fig. 3C). In contrast, BLD patients exhibit a primary focus on processes related to granuloma formation (Fig. 3D).
In addition, KEGG pathway analysis demonstrated that several metabolism-related pathways, including glycerophospholipid metabolism, glycolysis/gluconeogenesis, and fructose and mannose metabolism, were significantly enriched in the proteins of the LUSC group (Fig. 3E). Conversely, pathways such as complement and coagulation cascades and ECM-receptor interactions were enriched in the proteins of the BLD group (Fig. 3F). These findings provide valuable insights into the distinct molecular characteristics of LUSC and BLD, shedding light on the underlying biological processes and pathways associated with these conditions.
Establishment and validation of an LUSC early diagnosis model
To precisely identify protein biomarkers associated with LUSC within plasma EVs, we conducted differential analysis of proteomic data from 108 pairs of LUSC tissue samples to identify proteins exhibiting differential expression at both the plasma and tissue levels, emphasizing the importance of discovering biomarkers with consistent expression patterns in both contexts. The proteomic data from the 108 pairs of tissue samples were sourced from the PDC000234 dataset [17]. This comprehensive analysis revealed 2103 differentially expressed proteins; of these, the levels of 1309 were increased and those of 1677 were decreased in tumor tissues relative to normal tissues. Upon further examination, we identified 38 proteins whose levels were significantly increased both in tumor tissues and in the plasma EVs of LUSC patients (Fig. 4A), while the levels of 63 proteins were decreased in a similar manner (Fig. 4B). To assess the diagnostic potential of individual candidate biomarkers, we investigated the performance of 101 EV proteins (Fig. 4C, Fig. S2).
Fig. 4. Identification of tumor-associated EV signatures and proteins capable of differentiating LUSC from BLD. (A) Venn diagrams showing the number of proteins with significantly increased levels in both tumor tissue and LUSC EVs (NAT: normal adjacent tissue). (B) Venn diagrams showing the number of proteins with significantly decreased levels in both tumor tissue and LUSC-derived EVs. (C) Heatmap and diagnostic value of EV proteins altered in patients with LUSC relative to those with BLD.
After identifying proteins associated with LUSC, we selected five based on their functional relevance and strong diagnostic capabilities. These proteins were then employed to establish an early diagnostic model for discriminating between LUSC and BLD samples. To assess the diagnostic efficacy of this model, we selected a cohort comprising 80 patients with stage I/II LUSC and 84 patients with BLD for model establishment and validation. The BLD group included 56 samples from individuals with pneumonia granuloma and 28 samples from patients with pulmonary tuberculosis granuloma. To create a robust predictive model, we employed XGBoost classification with the mlr3 package, as this gradient boosting approach is known for its strong predictive capabilities. Our aim was to identify a subset of EV proteins that could accurately discriminate between LUSC and patients with BLD (Fig. 5A). We randomly divided the samples into a training set (70%) and an independent test set (30%) based on sample type, ensuring that both control and tumor samples were represented in each group.
Fig. 5. Plasma EV protein biomarkers for the diagnosis of LUSC. (A) Proteins with the highest predictive values in classifying LUSC and BLD with XGBoost. (B) ROC curve and confusion matrix in the training set. (C) ROC curve and confusion matrix in the test set. (D) ROC curve and confusion matrix in the entire cohort. SEN (Sensitivity) and SPE (Specificity) refer to diagnostic performance metrics for distinguishing LUSC from BLD samples. (E) Levels of TUBB3 in total plasma from patients with LUSC and BLD. A total of 74 samples (38 BLD and 36 LUSC) were tested. (F) Levels of RPS7 in total plasma from patients with LUSC and BLD. A total of 149 samples (44 BLD and 105 LUSC) were tested.
Through 10-fold cross-validation in the training set, we found that a subset of five EV proteins achieved a sensitivity (true positive rate) of 100% and a specificity (true negative rate) of 98%, with an AUC of 1.000 (95% CI: 0.998–1.000) (Fig. 5B). When applied to the independent test set, the model yielded a sensitivity of 100% and a specificity of 96%, with an AUC of 0.995 (95% CI: 0.980–1.000) (Fig. 5C). When applied to all samples, it maintained a sensitivity of 100% and a specificity of 98% (95% CI: 0.995–1.000) (Fig. 5D). These results indicate that our XGBoost classifier distinguishes LUSC from BLD in both the training set and the independent test set, supporting its potential clinical utility.
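Sensitivity and specificity follow directly from the confusion matrix, with LUSC treated as the positive class. The sketch below uses a hypothetical function name and made-up labels; it does not reproduce the study's predictions.

```python
def sensitivity_specificity(y_true, y_pred, positive="LUSC"):
    """Return (sensitivity, specificity) with `positive` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    return tp / (tp + fn), tn / (tn + fp)


if __name__ == "__main__":
    # Made-up example: every LUSC sample is detected, one BLD sample is misclassified.
    truth = ["LUSC"] * 4 + ["BLD"] * 5
    predicted = ["LUSC"] * 4 + ["BLD"] * 4 + ["LUSC"]
    print(sensitivity_specificity(truth, predicted))
```

A sensitivity of 100% therefore means that no LUSC sample was classified as benign, while a specificity below 100% reflects benign samples classified as LUSC.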
To validate the accuracy of the diagnostic model, we conducted additional verification using ELISA to confirm the presence of two EV protein biomarkers—TUBB3 and RPS7—identified in patients with BLD and LUSC. These two proteins were selected for further validation based on their high feature importance scores in the XGBoost diagnostic model and their biological relevance, while effective detection of EV-specific concentrations for RPLP1, KRT2, and VTN was not feasible due to current limitations in commercially available ELISA kits. ELISA results revealed a significant increase in the plasma level of TUBB3 in LUSC patients compared to BLD patients (P = 0.012, Wilcoxon test; Fig. 5E), and similarly, a higher level of RPS7 in LUSC patients than in BLD patients (P = 6.4e–05, Wilcoxon test; Fig. 5F). These findings underscore the utility of experimental validation in confirming the reliability of plasma EV protein biomarkers and establish a robust foundation for model development.
Establishing a prognostic prediction model for LUSC
To utilize the expression profiles of EV proteins, we conducted univariate Cox analysis and identified 18 prognostic proteins (Fig. S2). Subsequently, we selected 12 of these proteins associated with patient prognosis for integration via machine learning in the establishment of a protein signature. For all samples from LUSC, we employed 101 combinations of diverse machine-learning algorithms and calculated the C-index for each model. Intriguingly, the optimal model involved a combination of the LASSO and RSF algorithms, which achieved the highest C-index of 0.950 (Fig. 6A). In the LASSO regression, the optimal λ was determined by minimizing the partial likelihood deviance based on the algorithms (Fig. 6B). Nine proteins with nonzero coefficients from the LASSO regression were then subjected to RSF analysis. Ultimately, the top six proteins (DPYD, GALK1, CDC23, UBE2L3, RHEB, and PSME1) according to their importance were utilized to construct the prognostic model (Fig. 6C).
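The C-index used to rank the candidate models measures how often the patient with the higher predicted risk is the one who dies earlier. A minimal Python sketch of Harrell's concordance index follows; it is an illustration under simplifying assumptions (pairs with tied survival times are skipped) rather than the routine used in the study, and the function name and data are hypothetical.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index: fraction of comparable pairs ordered correctly by risk.

    A pair (i, j) is comparable when subject i has an observed event (events[i] == 1)
    and a strictly shorter follow-up time than subject j. Tied risks count as 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


if __name__ == "__main__":
    # Hypothetical follow-up times, event indicators, and model risk scores.
    print(concordance_index([2, 4, 6, 8], [1, 1, 0, 1], [0.9, 0.7, 0.8, 0.1]))
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, so a C-index near 0.95 indicates that almost all comparable patient pairs are ranked correctly.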
EV models were developed and validated via a machine learning-based integrative procedure. (A) A total of 101 prediction models were developed. (B) In the LUSC cohort, the optimal λ was determined when the partial likelihood deviance reached the minimum value, and the LASSO coefficients of the most useful prognostic proteins were calculated. (C) Importances of the 9 proteins obtained via random survival forest analysis. (D) Determining the optimal cutoff value for the high- and low-risk groups. (E) K‒M curves of OS according to the 6-gene signature for the LUSC training group (70%). (F) K‒M curves of OS according to the 6-gene signature for the LUSC validation group (30%). (G) ROC curve and AUC for the 6-gene signature classification in the training group (70%). (H) ROC curve and AUC for the 6-gene signature classification in the validation group (30%)
Next, we randomly partitioned the cohort of 124 patients into training (70%) and validation (30%) groups. In the training group, individual patient risk scores were computed by weighting the expression levels of the six identified proteins according to the training dataset. Subsequently, using the survminer package, we categorized patients in the training group into high-risk and low-risk groups using the optimal cutoff value (Fig. 6D). OS was significantly lower in the high-risk group than in the low-risk group in both the training and validation sets (Fig. 6E and F). Additionally, ROC curve analysis revealed AUC values for 1-, 3-, and 5-year OS of 0.795, 0.909, and 0.901, respectively, in the training group and of 0.7939, 0.833, and 0.775, respectively, in the validation group (Fig. 6G and H). These results collectively suggest that the prognostic model performs robustly in validation.
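The stratification steps described above (score each patient, split at a cutoff, compare survival curves) can be sketched as follows. This is an illustration under stated assumptions, not the study's pipeline: the risk score is assumed to be a linear weighted sum, the weights and cutoff are hypothetical placeholders, and the cutoff is taken as given rather than derived the way survminer derives it.

```python
def risk_score(expression, weights):
    """Weighted sum of protein expression values (assumed linear form)."""
    return sum(weights[p] * expression[p] for p in weights)


def stratify(scores, cutoff):
    """Label each score 'high' if it exceeds the cutoff, else 'low'."""
    return ["high" if s > cutoff else "low" for s in scores]


def kaplan_meier(times, events):
    """Kaplan-Meier estimate; returns [(t, S(t))] at each distinct event time."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Gather everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            surv *= 1 - deaths / at_risk
            curve.append((t, surv))
        at_risk -= removed  # deaths and censored both leave the risk set
    return curve


# Hypothetical weights for the six proteins; NOT the fitted model.
weights = {"DPYD": 0.3, "GALK1": 0.2, "CDC23": 0.4,
           "UBE2L3": 0.1, "RHEB": 0.5, "PSME1": -0.6}
patient = {"DPYD": 1.2, "GALK1": 0.8, "CDC23": 1.5,
           "UBE2L3": 0.9, "RHEB": 1.1, "PSME1": 0.7}
score = risk_score(patient, weights)
print(score, stratify([score], cutoff=1.0))
print(kaplan_meier([6, 14, 20, 33], [1, 0, 1, 1]))
```

Running `kaplan_meier` separately on the high-risk and low-risk patients yields the two curves that a K-M plot compares.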
Molecular biological characteristics of the high- and low-risk groups
To investigate the biological differences between the high-risk and low-risk groups, we employed ssGSEA to assess pathway enrichment in both groups. Our analysis revealed that the low-risk group was primarily associated with immune and metabolic pathways, including fatty acid metabolism, the intestinal immune network for IgA production, antigen processing and presentation, and complement and coagulation cascades. On the other hand, the high-risk group displayed significant associations with pathways related to cell growth and proliferation, including the mTOR signaling pathway, ErbB signaling pathway, MAPK signaling pathway, VEGF signaling pathway, JAK-STAT signaling pathway, and cell cycle (Fig. 7A). These findings suggest that the EV protein profile of the low-risk group more closely resembles that of BLD, while the high-risk group exhibits more malignant characteristics.
Subtyping of LUSC and associations with clinical outcomes. (A) ssGSEA (C2: curated gene sets, KEGG subset of canonical pathways) revealed the pathways that were significantly enriched in the proteomic subtypes. (B) Proteins in which complement and coagulation cascades pathways were enriched that were differentially expressed between the two proteomic subtypes. (C) Proteins in which non-small cell lung cancer–related pathways were enriched that were differentially expressed between the two proteomic subtypes. (D) The association between risk stratification and clinical information of the patients. Fisher’s exact test was used for categorical variables: sex, differentiation status, smoking status, and TNM stage
Within the complement and coagulation cascades pathway, the expression levels of proteins such as C1QA, C1QB, C1QC, C6, C7, and C8 were significantly greater in the low-risk group than in the high-risk group (Fig. 7B). C1QA, C1QB, and C1QC form C1q, which initiates the classical complement pathway, while C6, C7, and C8 are components of the membrane attack complex, which exerts cytotoxic effects on tumor cells. This observation may be associated with the better prognosis observed in the low-risk group. Within the non-small cell lung cancer pathway, proteins such as PIK3R1, AKT1, AKT2, MAP2K2, and PDPK1 were expressed at higher levels in the high-risk group than in the low-risk group (Fig. 7C). These proteins are known to be associated with cell growth and proliferation, suggesting a potential link to the poorer prognosis observed in the high-risk group. Consistent with current clinical knowledge, early-stage cancer was more common in patients in the low-risk group, while patients in the high-risk group had more advanced-stage disease (Fig. 7D).
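The association between risk group and categorical clinical variables such as TNM stage was assessed with Fisher's exact test. As a sketch of how that test works for a 2×2 table, the code below sums the hypergeometric probabilities of all tables that are no more likely than the observed one. The function name and the counts are hypothetical and are not taken from Fig. 7D.

```python
from math import comb


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2
    denom = comb(n, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell,
        # with all row and column totals held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Sum over all tables at least as extreme (no more probable) than observed.
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))


# Hypothetical counts, NOT study data:
# rows = low-risk / high-risk, columns = early stage / advanced stage.
p = fisher_exact_2x2(30, 10, 12, 28)
print(f"p = {p:.4g}")
```

The exact test is preferred over a chi-squared approximation when some cells of the contingency table contain few patients.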
Discussion
The accurate, noninvasive, and early diagnosis of LUSC remains a significant challenge, especially for patients with indeterminate nodules detected on imaging. LDCT is a common method for lung cancer screening; studies indicate that its imaging features can differentiate between malignant and benign nodules with sensitivities ranging from 76.2 to 92.85% and specificities ranging from 72.73 to 96.1% [18]. However, in the early detection of cancer, serum biomarkers such as SCCA and Cyfra 21-1 exhibit lower sensitivity and specificity, failing to meet clinical demands [19]. These limitations significantly impact the diagnosis and prognosis of LUSC patients. Research on novel diagnostic methods and biomarkers is crucial for improving the accuracy and clinical efficacy of lung cancer screening. Our proposed diagnostic model has the potential to improve early detection by clearly distinguishing benign lesions from malignant ones, thus potentially reducing unnecessary invasive procedures following LDCT screening.
Liquid biopsy is a noninvasive technique that is increasingly supported by evidence for its effectiveness in the screening and continuous monitoring of LUSC. Compared to traditional tissue biopsies, liquid biopsy is minimally invasive, allowing for the collection of consecutive blood samples over time to monitor cancer progression in real time. EVs secreted by cancer cells can be easily isolated from various body fluids and contain cancer-specific contents. Increasing evidence suggests that EVs can carry various factors that have been validated as alternative biomarkers for cancer progression and metastasis. This indicates that the analysis of circulating EVs could replace some invasive clinical procedures currently used for cancer diagnosis, prognosis, and prediction. In NSCLC, there is a growing interest in the potential of EV biomarker extraction as a less invasive alternative to tissue biopsy for early diagnosis and prognosis. Some research groups have conducted studies to explore the potential of detecting EV proteins for diagnosing NSCLC [20]. However, in the field of distinguishing lung squamous cell carcinoma from benign lung nodules, research is still in its early stages.
In this study, we employed a novel nanomaterial, NaY, to enrich plasma-derived EVs and identified specific EV protein biomarkers through proteomic analysis for precisely diagnosing LUSC. Importantly, we validated these findings in the plasma of independent samples using ELISA, confirming the detectability of these biomarkers in the original biological fluid and ensuring their potential clinical applicability. Furthermore, plasma-derived EVs not only enable accurate diagnostic identification but also offer predictive insights into the clinical prognosis of LUSC patients. This highlights the broader application prospects of EVs enriched by NaY.
Our study makes a significant contribution to the identification of EV biomarkers associated with LUSC. It is crucial to note that our study subjects also underwent pathological biopsy, enhancing the reliability of our analysis. Our findings demonstrate the effective enrichment of plasma EVs using the novel nanomaterial NaY, which is devoid of contaminants. Through GO enrichment analysis, we confirmed that the vesicular structures we enriched were indeed extracellular vesicles. Furthermore, KEGG analysis revealed that in LUSC patients, pathways such as glycolysis/gluconeogenesis and biosynthesis of amino acids are enriched in the proteins of the plasma EVs, aligning with the tissue characteristics of LUSC. This suggests that LUSC releases extracellular vesicles into the circulatory system that influence normal physiological functions and create an ecosystem favorable for cancer progression [21].
Through matching with differentially expressed data in tissues, we identified 38 EV biomarkers associated with LUSC that demonstrated high diagnostic accuracy. From these, we selected five protein biomarkers and established a diagnostic model using machine learning algorithms. Recent research has highlighted the overexpression of TUBB3 in solid tumors such as breast cancer, ovarian cancer, testicular cancer, and colorectal cancer [22, 23]. Moreover, TUBB3 overexpression serves as a biomarker for resistance to taxanes and vinca alkaloids and is associated with poor prognosis in various epithelial tumors, including NSCLC [24,25,26]. The upregulation of RPS7 is correlated with poorer recurrence-free survival and OS rates in prostate cancer patients, significantly enhancing prostate cancer cell growth and promoting cell migration by regulating the epithelial–mesenchymal transition [27, 28]. As reported by Artero-Castro and colleagues, RPLP1 regulates cell proliferation and has been implicated in promoting cell survival and migration in experiments involving liver cancer, cervical cancer, and breast cancer cells [29,30,31,32]. In uterine adenomyosis and endometrial cancer, RPLP1 overexpression serves as a diagnostic biomarker for less invasive lesions [33]. Consequently, we identified TUBB3, RPS7, and RPLP1 as diagnostic biomarkers for LUSC. KRT2, which is associated with keratinocyte activation, proliferation, and keratinization, and VTN, a major component of the extracellular matrix, were selected as diagnostic biomarkers for BLD [34, 35]. The diagnostic model established with these five biomarkers demonstrated excellent and robust diagnostic performance. Validation through ELISA experiments in an independent plasma cohort confirmed the expression levels of TUBB3 and RPS7, consistent with our MS results, underscoring the clinical value of our diagnostic biomarkers.
Furthermore, the proteins present in EVs not only enable precise diagnosis but also serve as indicators for assessing the survival status of patients with LUSC. Initially, a series of proteins were selected using a univariate Cox proportional hazards model. Subsequently, we fitted 101 models to the LUSC dataset, ultimately identifying the optimal model as that built from the combination of the LASSO and RSF algorithms. Ultimately, we selected six proteins, namely, DPYD, GALK1, CDC23, UBE2L3, RHEB, and PSME1, and constructed a prognostic model to calculate the risk score for each sample.
The DPYD gene encodes dihydropyrimidine dehydrogenase (DPD). Mutations within DPYD can result in diminished DPD activity, impacting the body’s ability to metabolize pyrimidine nucleosides. This alteration increases the risk of heightened toxicity following the administration of fluoropyrimidines [36]. GALK1 is a key enzyme in galactose metabolism that catalyzes the transfer of phosphate from ATP to galactose and participates in the first step of galactose breakdown [37]. CDC23 (also known as APC8) is a subunit of the APC complex that regulates mitosis by catalyzing the formation of ubiquitin chains [38, 39]. By modulating the expression of EMT biomarkers, CDC23 ultimately regulates malignant biological behavior in liver cancer cells, representing a novel target for further research on the growth and metastasis of liver cancer [40]. The UBE2L3 protein is a member of the E2 ubiquitin-conjugating enzyme family [41]. UBE2L3 promotes invasion and metastasis in lung adenocarcinoma through the GSK-3β/Snail signaling pathway and enhances proliferation and migration in liver cancer through the negative regulation of CDKN2B and CLDN1 [42, 43]. RHEB, a member of the Ras GTPase superfamily, encodes a protein linking growth factor signaling to mTORC1 activation [44]. A meta-analysis of published cancer cell genetics and transcriptome databases revealed that RHEB is overexpressed in various human cancers, inducing the formation of prostate tumors and lymphomas [45,46,47]. Elevated expression of PSME1 in gastric cancer patients is positively correlated with favorable OS, progression-free survival (PFS), and pathological survival (PS). It is also positively associated with increased infiltration of various immune cells and activation of steps within the anticancer immune cycle [48]. 
Integrating various machine learning algorithms and their combinations when fitting the LUSC prognostic model is advantageous for improving predictive accuracy. Combining algorithms also further reduced the dimensionality of the variables, simplifying the model and making it easier to translate into clinical applications.
In our study, we observed that in patients with LUSC, pathways associated with non-small cell lung cancer were primarily enriched in the proteins of the EVs in the high-risk group, reflecting the proliferative characteristics of the tumor. Conversely, in EVs in the low-risk group, pathways related to complement and coagulation cascades were enriched in their proteins, exhibiting characteristics similar to those observed in BLD patients. This risk stratification correlates with the clinical stage of patients, emphasizing the potential role of EVs in revealing information about tumor cell behavior. This discovery not only aids in understanding the associations between different clinical presentations in LUSC patients but also provides a crucial foundation for future therapeutic strategies and personalized medicine.
Although the field of EV proteomics in lung cancer research has been well explored, it continues to present various challenges. However, our study has effectively addressed these challenges through innovative approaches. (1) Pioneering EV enrichment: This approach requires only 40 µl of plasma, making it highly efficient with a minimal sample volume. Additionally, the need for UC is eliminated, and the binding process is completed within 10 min through incubation with nanomaterials. In contrast, traditional UC methods are time-consuming, are less stable, and necessitate a plasma volume more than 250 times that of the presented method. (2) Diagnostic breakthrough: Our research has yielded significant results by identifying EV-related biomarkers specific for LUSC and developing a noninvasive diagnostic model with exceptional accuracy. This model offers a highly effective means of distinguishing LUSC from BLD. (3) Prognostic insights: By leveraging the expression levels of six key proteins within the EVs from the plasma of LUSC patients, our study has contributed to the prediction of patient prognosis, shedding light on their clinical outcomes. (4) Revealing intercellular communication: Our findings provide compelling evidence of the ability of EVs, which are rich in protein, to transmit vital information and participate promptly in intercellular communication. This phenomenon is intricately linked to the invasiveness and metastatic properties of tumors.
We acknowledge that our study has limitations. The LUSC and BLD cohorts were not fully matched with respect to key demographic variables such as age, gender, and smoking status. For example, in both the MS and ELISA cohorts, the LUSC group had a higher proportion of males and smokers compared to the BLD group. These differences may have influenced the distribution of EV protein expression and introduced potential confounding factors that could affect the diagnostic model’s accuracy. As such, the observed performance metrics should be interpreted with caution, and future studies involving demographically balanced cohorts are warranted to further validate our findings. We recognize that our ELISA validation measured total plasma protein concentrations, including both EV-bound and soluble forms of the biomarkers. Consequently, the diagnostic efficacy demonstrated by our EV-based protein signature may not translate directly to measurements performed on plasma alone. Future research will require developing advanced methodologies or specialized assays capable of specifically quantifying EV-derived proteins separately from free circulating forms to enhance the clinical applicability of EV biomarkers. It is noteworthy that the EV-negative marker GRP94 exhibited relatively high abundance in both UC and NaY groups, with even higher levels in the NaY-enriched samples (Table S2). This suggests that NaY-based enrichment, while highly efficient, may also co-isolate certain non-EV proteins due to nonspecific interactions. Moreover, the intrinsic abundance of GRP94 in plasma may further contribute to its presence in EV isolates. In addition, it is possible that the enrichment of low-abundance EV-associated proteins by NaY led to a relative increase in the detectable abundance of non-EV markers such as GRP94, especially when using bulk proteomic quantification approaches.
These findings highlight the importance of optimizing EV isolation protocols to minimize contamination and ensure higher EV purity in future studies. In summary, our research not only overcomes existing challenges but also introduces innovative methods and diagnostic tools that hold promise for improving the diagnosis, prognosis, and understanding of the underlying mechanisms of LUSC [49, 50].
Conclusion
We discovered that the novel nanomaterial NaY can effectively enrich EVs from plasma. We explored plasma EV protein biomarkers associated with LUSC, establishing a diagnostic model capable of distinguishing between benign and malignant lung nodules in patients. Additionally, we developed a prognostic model that accurately predicts patient outcomes.
Data availability
The MS proteomics data have been deposited in the ProteomeXchange Consortium (https://proteomecentral.proteomexchange.org) via the iProX partner repository with the dataset identifier PXD047893.
Abbreviations
- LUSC: Lung squamous cell carcinoma
- EVs: Extracellular vesicles
- BLD: Benign lung disease
- NSCLC: Non-small cell lung cancer
- LDCT: Low-dose computed tomography
- SCCA: Squamous cell carcinoma antigen
- CEA: Carcinoembryonic antigen
- NSE: Neuron-specific enolase
- NTA: Nanoparticle tracking analysis
- TEM: Transmission electron microscopy
- NCE: Normalized collision energy
- FDR: False discovery rate
- GO: Gene Ontology
- KEGG: Kyoto Encyclopedia of Genes and Genomes
- ssGSEA: Single-sample gene set enrichment analysis
- AUC: Area under the curve
- ELISAs: Enzyme-linked immunosorbent assays
- OS: Overall survival
- RSF: Random survival forest
- Enet: Elastic network
- SuperPC: Supervised principal components
- GBM: Generalized boosted regression modeling
- Survival-SVM: Survival support vector machine
- UC: Ultracentrifugation
References
Sung H, Ferlay J, Siegel RL, et al. Global cancer statistics 2020: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. Cancer J Clin. 2021;71(3):209–49.
Socinski MA, Obasaju C, Gandara D, et al. Current and emergent therapy options for advanced squamous cell lung Cancer. J Thorac Oncol Feb. 2018;13(2):165–83. https://doiorg.publicaciones.saludcastillayleon.es/10.1016/j.jtho.2017.11.111.
Ang YL, Tan HL, Soo RA. Best practice in the treatment of advanced squamous cell lung cancer. Ther Adv Respir Dis Oct. 2015;9(5):224–35. https://doiorg.publicaciones.saludcastillayleon.es/10.1177/1753465815581147.
Sestini S, Boeri M, Marchianò A et al. [Lung cancer screening in high-risk subjects: early detection with LDCT and risk stratification using miRNA-based blood test]. Epidemiol Prev. Jan-Feb. 2016;40(1 Suppl 1):42–50. Screening per il tumore polmonare in soggetti ad alto rischio: diagnosi precoce con TC spirale associata a stratificazione del rischio con miRNA circolanti. https://doiorg.publicaciones.saludcastillayleon.es/10.19191/ep16.1s1.P042.029
Song WA, Liu X, Tian XD, et al. Utility of squamous cell carcinoma antigen, carcinoembryonic antigen, Cyfra 21– 1 and neuron specific enolase in lung cancer diagnosis: a prospective study from China. Chin Med J (Engl) Oct. 2011;124(20):3244–8.
Liu YJ, Wang C. A review of the regulatory mechanisms of extracellular vesicles-mediated intercellular communication. Cell Commun Signal Apr. 2023;13(1):77. https://doiorg.publicaciones.saludcastillayleon.es/10.1186/s12964-023-01103-6.
Kalluri R, McAndrews KM. The role of extracellular vesicles in cancer. Cell. Apr 13. 2023;186(8):1610–1626. https://doiorg.publicaciones.saludcastillayleon.es/10.1016/j.cell.2023.03.010
Ma C, Li Y, Li J, et al. Comprehensive and deep profiling of the plasma proteome with protein Corona on zeolite NaY. J Pharm Anal May. 2023;13(5):503–13. https://doiorg.publicaciones.saludcastillayleon.es/10.1016/j.jpha.2023.04.002.
Chen, Chen W-Y, Tsai P-J, Chien K-Y, Yu J-S, Chen Y-C. Rapid enrichment of phosphopeptides and phosphoproteins from complex samples using magnetic particles coated with alumina as the concentrating probes for MALDI MS analysis. J Proteome Res. 2007;6(1):316–25.
Yao J, Sun N, Deng C. Recent advances in mesoporous materials for sample Preparation in proteomics research. TRAC Trends Anal Chem. 2018;99:88–100.
Ritchie ME, Phipson B, Wu D, et al. Limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res Apr. 2015;20(7):e47. https://doiorg.publicaciones.saludcastillayleon.es/10.1093/nar/gkv007.
Yu G, Wang LG, Han Y, He QY. ClusterProfiler: an R package for comparing biological themes among gene clusters. Omics May. 2012;16(5):284–7. https://doiorg.publicaciones.saludcastillayleon.es/10.1089/omi.2011.0118.
Hänzelmann S, Castelo R, Guinney J. GSVA: gene set variation analysis for microarray and RNA-seq data. BMC Bioinf Jan. 2013;16:14:7. https://doiorg.publicaciones.saludcastillayleon.es/10.1186/1471-2105-14-7.
Robin X, Turck N, Hainard A, et al. pROC: an open-source package for R and S + to analyze and compare ROC curves. BMC Bioinf Mar. 2011;17:12:77. https://doiorg.publicaciones.saludcastillayleon.es/10.1186/1471-2105-12-77.
Liu Z, Lu T, Wang Y, et al. Establishment and experimental validation of an immune MiRNA signature for assessing prognosis and immune landscape of patients with colorectal cancer. J Cell Mol Med Jul. 2021;25(14):6874–86. https://doiorg.publicaciones.saludcastillayleon.es/10.1111/jcmm.16696.
Salazar R, Tabernero J. New approaches but the same flaws in the search for prognostic signatures. Clin Cancer Res Apr. 2014;15(8):2019–22. https://doiorg.publicaciones.saludcastillayleon.es/10.1158/1078-0432.Ccr-14-0219.
Satpathy S, Krug K, Jean Beltran PM, et al. A proteogenomic portrait of lung squamous cell carcinoma. Cell Aug. 2021;5(16):4348–e437140. https://doiorg.publicaciones.saludcastillayleon.es/10.1016/j.cell.2021.07.016.
Wu YJ, Wu FZ, Yang SC, Tang EK, Liang CH. Radiomics in early lung Cancer diagnosis: from diagnosis to clinical decision support and education. Diagnostics (Basel) Apr. 2022;24(5). https://doiorg.publicaciones.saludcastillayleon.es/10.3390/diagnostics12051064.
Zhao Y, Liu Y, Li S, et al. Role of lung and gut microbiota on lung cancer pathogenesis. J Cancer Res Clin Oncol Aug. 2021;147(8):2177–86. https://doiorg.publicaciones.saludcastillayleon.es/10.1007/s00432-021-03644-0.
Sandfeld-Paulsen B, Aggerholm-Pedersen N, Bæk R, et al. Exosomal proteins as prognostic biomarkers in non-small cell lung cancer. Mol Oncol Dec. 2016;10(10):1595–602. https://doiorg.publicaciones.saludcastillayleon.es/10.1016/j.molonc.2016.10.003.
Man J, Zhang X, Dong H, et al. Screening and identification of key biomarkers in lung squamous cell carcinoma by bioinformatics analysis. Oncol Lett Nov. 2019;18(5):5185–96. https://doiorg.publicaciones.saludcastillayleon.es/10.3892/ol.2019.10873.
Sobierajska K, Wieczorek K, Ciszewski WM, et al. β-III tubulin modulates the behavior of snail overexpressed during the epithelial-to-mesenchymal transition in colon cancer cells. Biochim Biophys Acta Sep. 2016;1863(9):2221–33. https://doiorg.publicaciones.saludcastillayleon.es/10.1016/j.bbamcr.2016.05.008.
Kavallaris M, Burkhart CA, Horwitz SB. Antisense oligonucleotides to class III beta-tubulin sensitize drug-resistant cells to taxol. Br J Cancer Jun. 1999;80(7):1020–5. https://doiorg.publicaciones.saludcastillayleon.es/10.1038/sj.bjc.6690507.
Ferrandina G, Zannoni GF, Martinelli E, et al. Class III beta-tubulin overexpression is a marker of poor clinical outcome in advanced ovarian cancer patients. Clin Cancer Res May. 2006;1(9):2774–9. https://doiorg.publicaciones.saludcastillayleon.es/10.1158/1078-0432.Ccr-05-2715.
Sève P, Isaac S, Trédan O, et al. Expression of class III {beta}-tubulin is predictive of patient outcome in patients with non-small cell lung cancer receiving vinorelbine-based chemotherapy. Clin Cancer Res Aug. 2005;1(15):5481–6. https://doiorg.publicaciones.saludcastillayleon.es/10.1158/1078-0432.Ccr-05-0285.
Leandro-García LJ, Leskelä S, Landa I, et al. Tumoral and tissue-specific expression of the major human beta-tubulin isotypes. Cytoskeleton (Hoboken) Apr. 2010;67(4):214–23. https://doiorg.publicaciones.saludcastillayleon.es/10.1002/cm.20436.
Zhang C, Qie Y, Yang T, et al. Kinase PIM1 promotes prostate cancer cell growth via c-Myc-RPS7-driven ribosomal stress. Carcinog Mar. 2019;12(1):52–60. https://doiorg.publicaciones.saludcastillayleon.es/10.1093/carcin/bgy126.
Wen Y, An Z, Qiao B, Zhang C, Zhang Z. RPS7 promotes cell migration through targeting epithelial-mesenchymal transition in prostate cancer. Urol Oncol May. 2019;37(5):297e. 1-297.e7.
Artero-Castro A, Kondoh H, Fernández-Marcos PJ, Serrano M, Ramón y Cajal S, Lleonart ME. Rplp1 bypasses replicative senescence and contributes to transformation. Exp Cell Res May. 2009;1(8):1372–83. https://doiorg.publicaciones.saludcastillayleon.es/10.1016/j.yexcr.2009.02.007.
Xie C, Cao K, Peng D, Qin L. RPLP1 is highly expressed in hepatocellular carcinoma tissues and promotes proliferation, invasion and migration of human hepatocellular carcinoma Hep3b cells. Exp Ther Med Jul. 2021;22(1):752. https://doiorg.publicaciones.saludcastillayleon.es/10.3892/etm.2021.10184.
Xia L, Yue Y, Li M, et al. CNN3 acts as a potential oncogene in cervical cancer by affecting RPLP1 mRNA expression. Sci Rep Feb. 2020;12(1):2427. https://doiorg.publicaciones.saludcastillayleon.es/10.1038/s41598-020-58947-y.
He Z, Xu Q, Wang X, et al. RPLP1 promotes tumor metastasis and is associated with a poor prognosis in triple-negative breast cancer patients. Cancer Cell Int. 2018;18:170. https://doiorg.publicaciones.saludcastillayleon.es/10.1186/s12935-018-0658-0.
Peterson R, Minchella P, Cui W, Graham A, Nothnick WB. RPLP1 is Up-Regulated in human adenomyosis and endometrial adenocarcinoma epithelial cells and is essential for cell survival and migration in vitro. Int J Mol Sci Jan. 2023;31(3). https://doiorg.publicaciones.saludcastillayleon.es/10.3390/ijms24032690.
Bloor BK, Tidman N, Leigh IM, et al. Expression of keratin K2e in cutaneous and oral lesions: association with keratinocyte activation, proliferation, and keratinization. Am J Pathol. Mar 2003;162(3):963–75. https://doiorg.publicaciones.saludcastillayleon.es/10.1016/s0002-9440(10)63891-6.
Peng Y, Li L, Shang J, et al. Macrophage promotes fibroblast activation and kidney fibrosis by assembling a vitronectin-enriched microenvironment. Theranostics. 2023;13(11):3897–913. https://doiorg.publicaciones.saludcastillayleon.es/10.7150/thno.85250.
Diasio RB, Offer SM. Testing for dihydropyrimidine dehydrogenase deficiency to individualize 5-Fluorouracil therapy. Cancers (Basel) Jun. 2022;30(13). https://doiorg.publicaciones.saludcastillayleon.es/10.3390/cancers14133207.
Davalieva K, Kiprijanovska S, Ivanovski O, et al. Proteomics profiling of bladder Cancer tissues from early to advanced stages reveals NNMT and GALK1 as biomarkers for early detection and prognosis of BCa. Int J Mol Sci Oct. 2023;6(19). https://doiorg.publicaciones.saludcastillayleon.es/10.3390/ijms241914938.
Prinz S, Hwang ES, Visintin R, Amon A. The regulation of Cdc20 proteolysis reveals a role for APC components Cdc23 and Cdc27 during S phase and early mitosis. Curr Biol Jun. 1998;18(13):750–60. https://doiorg.publicaciones.saludcastillayleon.es/10.1016/s0960-9822(98)70298-2.
Hershko A. Mechanisms and regulation of the degradation of cyclin B. Philos Trans R Soc Lond B Biol Sci. Sep 29. 1999;354(1389):1571-5; discussion 1575-6. https://doiorg.publicaciones.saludcastillayleon.es/10.1098/rstb.1999.0500
Zhang Y, Luo L, Fu C, Hu W, Li Y, Xiong J. CDC23 knockdown suppresses the proliferation, migration and invasion of liver cancer via the EMT process. Oncol Lett Jul. 2023;26(1):291. https://doiorg.publicaciones.saludcastillayleon.es/10.3892/ol.2023.13877.
Peris-Moreno D, Malige M, Claustre A, et al. UBE2L3, a partner of MuRF1/TRIM63, is involved in the degradation of myofibrillar actin and myosin. Cells Aug. 2021;3(8). https://doiorg.publicaciones.saludcastillayleon.es/10.3390/cells10081974.
Ma X, Qi W, Yang F, Pan H. UBE2L3 promotes lung adenocarcinoma invasion and metastasis through the GSK-3β/Snail signaling pathway. Am J Transl Res. 2022;14(7):4549–61.
Liu Y, Song C, Ni H, et al. UBE2L3, a susceptibility gene that plays oncogenic role in hepatitis B-related hepatocellular carcinoma. J Viral Hepat Nov. 2018;25(11):1363–71. https://doiorg.publicaciones.saludcastillayleon.es/10.1111/jvh.12963.
Castro AF, Rebhun JF, Clark GJ, Quilliam LA. Rheb binds tuberous sclerosis complex 2 (TSC2) and promotes S6 kinase activation in a rapamycin- and farnesylation-dependent manner. J Biol Chem Aug. 2003;29(35):32493–6. https://doiorg.publicaciones.saludcastillayleon.es/10.1074/jbc.C300226200.
Lu KH, Wu W, Dave B, et al. Loss of tuberous sclerosis complex-2 function and activation of mammalian target of Rapamycin signaling in endometrial carcinoma. Clin Cancer Res May. 2008;1(9):2543–50. https://doiorg.publicaciones.saludcastillayleon.es/10.1158/1078-0432.Ccr-07-0321.
Mavrakis KJ, Zhu H, Silva RL, et al. Tumorigenic activity and therapeutic Inhibition of rheb GTPase. Genes Dev Aug. 2008;15(16):2178–88. https://doiorg.publicaciones.saludcastillayleon.es/10.1101/gad.1690808.
Nardella C, Chen Z, Salmena L, et al. Aberrant Rheb-mediated mTORC1 activation and Pten haploinsufficiency are cooperative oncogenic events. Genes Dev Aug. 2008;15(16):2172–7. https://doiorg.publicaciones.saludcastillayleon.es/10.1101/gad.1699608.
Guo Y, Dong X, Jin J, He Y. The expression patterns and prognostic value of the proteasome activator subunit gene family in gastric Cancer based on integrated analysis. Front Cell Dev Biol. 2021;9:663001. https://doiorg.publicaciones.saludcastillayleon.es/10.3389/fcell.2021.663001.
Ma J, Chen T, Wu S, et al. iProX: an integrated proteome resource. Nucleic Acids Res Jan. 2019;8(D1):D1211–7. https://doiorg.publicaciones.saludcastillayleon.es/10.1093/nar/gky869.
Chen T, Ma J, Liu Y, et al. iProX in 2021: connecting proteomics data sharing with big data. Nucleic Acids Res Jan. 2022;7(D1):D1522–7. https://doiorg.publicaciones.saludcastillayleon.es/10.1093/nar/gkab1081.
Acknowledgements
We would like to thank Dr. Kai Zhang for his professional help in data analysis. The authors would like to thank all of the patients and their families for their support of this study. Figure 1 was produced with BioRender.
Funding
This work was supported by the National Key R&D Program of China (No. 2022YFF0705004, No. 2022YFE0103600), the National Natural Science Foundation of China (No. 82273120) and the Tianjin Key Laboratory of Clinical Multiomics (PTSWKL882304016).
Author information
Contributions
TX, YTL and JL conceived and designed the study. SM, NZ and CCM performed the experiments. SM, NA and XD analysed the data and drafted the manuscript. TX and JL secured financing of the study. YRW, LS, RQZ, XCZ and SJC contributed to the review and editing. All authors read and approved the final manuscript.
Ethics declarations
Ethics approval and consent to participate
This research was approved by the Ethics Committee of the Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College.
Consent for publication
Not applicable.
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Ma, S., Zhao, N., Dong, X. et al. Liquid biopsy-derived extracellular vesicle protein biomarkers for diagnosis and prognostic assessment of lung squamous cell carcinoma. Cancer Cell Int 25, 161 (2025). https://doiorg.publicaciones.saludcastillayleon.es/10.1186/s12935-025-03792-0
DOI: https://doiorg.publicaciones.saludcastillayleon.es/10.1186/s12935-025-03792-0