Screening of key pathogenic genes of type 1 diabetes in children
• The key pathogenic genes CCL25 and EGFR have good diagnostic efficacy in children with T1DM.
What is known and what is new?
• Sequencing technology, bioinformatics, and machine learning algorithms promote the identification of potential disease-causing genes and the exploration of disease pathogenesis.
• This study utilizes the sequencing data from PBMC samples of children with T1DM from a public database and employs WGCNA to screen for key pathogenic genes.
What is the implication, and what should change now?
• The key pathogenic genes of T1DM in children can be identified by WGCNA. CCL25 and EGFR are involved in the occurrence and progression of T1DM in children.
Diabetes is a heterogeneous group of diseases that pose a significant threat to human health regardless of etiology, pathogenesis, or clinical phenotype (1,2). Type 1 diabetes mellitus (T1DM) constitutes 1–15% of all forms of diabetes. T1DM in children is an autoimmune disease characterized by progressive destruction of islets of Langerhans-derived β cells (3,4). Thus, patients need to rely on insulin therapy due to a significant reduction or complete lack of insulin secretion (5). The international community recognizes the importance of strengthening the prevention and control of T1DM in children (6,7). Precision medicine provides a promising approach to achieve this goal, with recent studies demonstrating the potential of vitamin D supplementation during pregnancy and the neonatal period to reduce T1DM risk (8,9). A study reported that CD3 monoclonal antibody injection could delay the onset of T1DM in children (10). Accurate prevention and treatment of T1DM requires the identification of key pathogenic genes (11), which can serve as biological markers for early diagnosis and typing, as well as therapeutic targets.
The specific pathological mechanism of T1DM in pediatric populations remains elusive. Currently, there is a lack of research on screening key pathogenic genes based on sequencing data and efficient algorithms. However, recent advancements in sequencing technology, bioinformatics, and machine learning algorithms have enabled the identification of potential disease-causing genes and further exploration of disease pathogenesis. For example, through the analysis of sequencing data, common pathogenic genes for diabetes and kidney cancer have been identified (12). Given the difficulty in obtaining pancreatic tissue samples, peripheral blood-based immunological markers may be useful for the diagnosis of T1DM in children. This study utilizes the sequencing data from peripheral blood mononuclear cell (PBMC) samples of children with T1DM from a public database and employs weighted gene co-expression network analysis (WGCNA) to screen for key pathogenic genes. The diagnostic efficacy of these key pathogenic genes in children with T1DM is evaluated, and the results of this study provide insights into the pathogenesis of T1DM in children. We present the following article in accordance with the STREGA reporting checklist (available at https://tp.amegroups.com/article/view/10.21037/tp-23-201/rc).
The full transcriptome sequencing results of PBMCs from children with T1DM were obtained from the Gene Expression Omnibus (GEO) database (GSE156035), comprising 20 T1DM samples and 20 normal samples. The corresponding platform file is GPL20844. In this study, the T1DM samples were designated as the treatment group and the normal samples as the control group. Probe names were converted to gene names according to the platform file annotation. The study was conducted in accordance with the Declaration of Helsinki (as revised in 2013).
R software (version 3.5.1; The R Foundation for Statistical Computing, Vienna, Austria) and the relevant R packages were used to identify differentially expressed genes (DEGs) between the T1DM and normal samples. The fold change (FC) was defined as the ratio of the average gene expression level in the treatment group to that in the control group. The screening criteria for DEGs were an FC of at least 1.5 in either direction (|log2FC| ≥0.585) and an adjusted P value less than 0.05.
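The screening rule above can be illustrated with a minimal Python sketch. It is not the R workflow used in this study: the Benjamini-Hochberg adjustment, the use of linear-scale group means, and all gene names and values are assumptions chosen for illustration.

```python
import math

def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    n = len(pvals)
    order = sorted(range(n), key=lambda i: pvals[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest P value down so adjusted values stay monotone.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvals[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted

def screen_degs(genes, mean_case, mean_ctrl, pvals, fc_cutoff=1.5, alpha=0.05):
    """Label a gene 'up' or 'down' when |log2FC| and adjusted P pass the cutoffs."""
    log_cutoff = math.log2(fc_cutoff)  # log2(1.5) is about 0.585
    adj = benjamini_hochberg(pvals)
    result = {}
    for gene, case, ctrl, q in zip(genes, mean_case, mean_ctrl, adj):
        log2fc = math.log2(case / ctrl)
        if q < alpha and abs(log2fc) >= log_cutoff:
            result[gene] = "up" if log2fc > 0 else "down"
    return result

# Hypothetical toy input: group means on the linear scale and raw P values.
genes = ["g1", "g2", "g3", "g4"]
mean_case = [30.0, 10.0, 12.0, 5.0]
mean_ctrl = [10.0, 30.0, 10.0, 5.0]
pvals = [0.001, 0.002, 0.01, 0.8]
degs = screen_degs(genes, mean_case, mean_ctrl, pvals)
print(degs)
```

In this toy input, g3 has a small adjusted P value but an FC of only 1.2, so it is excluded; both cutoffs must be met.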
The identified DEGs were further analyzed for functional enrichment using the Kyoto Encyclopedia of Genes and Genomes (KEGG) and Gene Ontology (GO) databases. The screening criterion was a q-value <0.05.
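The enrichment test itself is not specified here. Over-representation analysis is commonly based on a hypergeometric test, and the sketch below assumes that approach; the function name and the numbers are hypothetical. The resulting P values would still need multiple-testing correction to obtain q-values.

```python
from math import comb

def hypergeom_enrichment_p(n_universe, n_term, n_degs, n_overlap):
    """Upper-tail P value: the chance of seeing at least n_overlap annotated
    genes when n_degs genes are drawn from a universe of n_universe genes,
    n_term of which carry the annotation."""
    total = comb(n_universe, n_degs)
    upper = min(n_term, n_degs)
    tail = sum(
        comb(n_term, k) * comb(n_universe - n_term, n_degs - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total

# Hypothetical toy numbers: 10 background genes, 4 annotated to a term,
# 3 DEGs, 2 of which are annotated.
p_value = hypergeom_enrichment_p(10, 4, 3, 2)
print(p_value)
```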
Key gene screening
To identify key genes, the weighted gene co-expression network analysis (WGCNA) package in R was used to construct a gene co-expression network. First, the Pearson correlation coefficients of all gene pairs were calculated to construct the correlation matrix. Then, the correlation matrix was transformed into an adjacency matrix using a power function, with the soft threshold β chosen so that the network approximated a scale-free topology. The adjacency matrix was transformed into a topological overlap matrix (TOM), and the gene-to-gene dissimilarity matrix (dissTOM = 1 − TOM) was calculated. Hierarchical clustering was performed on the dissTOM to obtain a clustering dendrogram, grouping genes with similar expression into the same cluster. The Dynamic Tree Cut algorithm was used to distinguish co-expression modules, with the minimum number of genes per module set to 30, and module eigengene (ME) values were calculated to merge highly similar modules. This study employed 2 methods to identify modules related to clinical phenotypes. The first method involved calculating the correlation coefficient and P value between the module eigengenes and the disease phenotype to determine key modules. The second method involved calculating the gene significance (GS) and module significance (MS) to determine key modules. GS refers to the correlation coefficient between the expression of a gene and a clinical trait, whereas MS refers to the average GS of all genes in the module. Generally, the greater the absolute values of GS and MS, the stronger the correlation with the disease. Module membership (MM) is used to measure the importance of a gene in a module. The criteria for screening hub genes were MM >0.8 and GS >0.5.
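The first steps of this pipeline (correlation, power-function adjacency, TOM, and dissTOM) can be sketched in plain Python as follows. This is an illustrative sketch rather than the WGCNA package: it assumes an unsigned network, uses the standard topological overlap formula, and runs on a hypothetical 4-gene, 5-sample expression table.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def adjacency(expr, beta):
    """Unsigned adjacency a_ij = |cor(i, j)| ** beta (assumed network type)."""
    g = len(expr)
    return [[abs(pearson(expr[i], expr[j])) ** beta for j in range(g)]
            for i in range(g)]

def tom_dissimilarity(adj):
    """dissTOM = 1 - TOM, where
    TOM_ij = (sum_u a_iu * a_uj + a_ij) / (min(k_i, k_j) + 1 - a_ij)."""
    g = len(adj)
    # Connectivity k_i: sum of adjacencies to all other genes.
    k = [sum(adj[i][u] for u in range(g) if u != i) for i in range(g)]
    diss = [[0.0] * g for _ in range(g)]
    for i in range(g):
        for j in range(g):
            if i == j:
                continue  # TOM_ii = 1, so the diagonal of dissTOM stays 0
            shared = sum(adj[i][u] * adj[u][j]
                         for u in range(g) if u != i and u != j)
            tom = (shared + adj[i][j]) / (min(k[i], k[j]) + 1.0 - adj[i][j])
            diss[i][j] = 1.0 - tom
    return diss

# Hypothetical expression values: one row per gene, one column per sample.
expr = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 4.0, 6.0, 8.0, 10.5],
    [3.0, 1.0, 4.0, 1.0, 5.0],
    [2.0, 7.0, 1.0, 8.0, 2.0],
]
diss_tom = tom_dissimilarity(adjacency(expr, beta=5))
for row in diss_tom:
    print([round(v, 3) for v in row])
```

Genes 0 and 1 are nearly collinear, so their dissimilarity is close to 0 and hierarchical clustering would place them in the same module, whereas genes 0 and 2 are weakly correlated and remain far apart.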
R software (version 3.5.1) and relevant R packages were used for statistical analysis. Comparisons between groups were performed using the rank-sum test. The diagnostic efficiency of the key genes for childhood T1DM was evaluated using a receiver operating characteristic (ROC) curve. The larger the area under the curve (AUC), the better the diagnostic efficiency of the gene. A two-sided P value of less than 0.05 was considered statistically significant.
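The AUC and the rank-sum test are closely linked: the AUC equals the Mann-Whitney U statistic divided by the number of case-control pairs. The sketch below computes the AUC in this way, assuming that higher expression indicates disease; the expression values are hypothetical.

```python
def auc_from_pairs(case_values, control_values):
    """AUC as the fraction of case-control pairs in which the case value is
    higher, counting ties as half (Mann-Whitney U / (n_case * n_control))."""
    wins = 0.0
    for case in case_values:
        for control in control_values:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_values) * len(control_values))

# Hypothetical expression values of one gene in each group.
case_expr = [5.1, 6.3, 4.8, 7.0]
control_expr = [4.0, 5.0, 4.9, 3.2]
auc = auc_from_pairs(case_expr, control_expr)
print(auc)  # 14 of 16 pairs favor the case sample, giving 0.875
```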
Screening of DEGs
In this study, 293 DEGs were screened using the criteria of |log2FC| ≥0.585 and false discovery rate (FDR) <0.05. Compared with the control group, 94 genes were down-regulated and 199 genes were up-regulated in the treatment group (Figure 1).
GO enrichment analysis based on DEGs
GO enrichment analysis showed that DEGs were significantly enriched in biological process (BP) such as cartilage development, digestion, connective tissue development, response to xenobiotic stimulus, and tissue homeostasis. DEGs were also significantly enriched in cellular component (CC) such as collagen-containing extracellular matrix (ECM), collagen trimer, endoplasmic reticulum lumen, complex of collagen trimers, and apical part of cell. Additionally, DEGs were significantly enriched in molecular function (MF) such as ECM structural constituent, glycosaminoglycan binding, ECM structural constituent conferring tensile strength, peptidase regulator activity, and carbonate dehydratase activity (Figure 2).
KEGG enrichment analysis based on DEGs
KEGG enrichment analysis showed that DEGs were significantly enriched in pathways such as gastric acid secretion, ECM-receptor interaction, protein digestion and absorption, retinol metabolism, drug metabolism-cytochrome P450, focal adhesion, collecting duct acid secretion, glycolysis/gluconeogenesis, tyrosine metabolism, and metabolism of xenobiotics by cytochrome P450 (Figure 3).
Screening hub gene based on WGCNA
WGCNA was used to analyze the gene expression data. The soft threshold was set at β=5, the first value at which the R² of the scale-free topology model reached 0.9 while the average connectivity remained high enough to retain sufficient information (Figure 4). Based on this, a hierarchical clustering tree and co-expression modules (Figure 5) of the WGCNA network were constructed, and 9 gene modules were finally obtained. Modules were selected based on the criteria of |Cor| >0.5 and P<0.05. The black module (Cor =0.52, P=2e−12) was positively correlated with diabetic traits, whereas the brown (Cor =−0.51, P=5e−12) and pink (Cor =−0.53, P=5e−13) modules were negatively correlated (Figure 6). The black module contained 15 hub genes selected by the criteria of MM >0.8 and GS >0.5 (Figure 7), the brown module contained 52 hub genes (Figure 8), and the pink module contained 9 hub genes (Figure 9). The intersection of the hub genes and DEGs included 2 genes, CCL25 and EGFR.
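The hub gene selection and its intersection with the DEGs can be sketched as a simple filter. The gene names and values below are hypothetical, and applying the cutoffs to absolute values is an assumption made here so that genes from negatively correlated modules can also qualify.

```python
def select_hub_genes(mm, gs, mm_cutoff=0.8, gs_cutoff=0.5):
    """Return genes whose |MM| and |GS| both exceed the cutoffs.
    Using absolute values is an assumption of this sketch."""
    return {
        gene for gene in mm
        if abs(mm[gene]) > mm_cutoff and abs(gs.get(gene, 0.0)) > gs_cutoff
    }

# Hypothetical module membership (MM) and gene significance (GS) values.
mm = {"geneA": 0.91, "geneB": 0.85, "geneC": 0.60, "geneD": -0.88}
gs = {"geneA": 0.62, "geneB": 0.30, "geneC": 0.70, "geneD": -0.55}
deg_set = {"geneA", "geneC", "geneE"}  # hypothetical DEG list

hub_genes = select_hub_genes(mm, gs)
key_genes = hub_genes & deg_set  # genes that are both hub genes and DEGs
print(sorted(hub_genes), sorted(key_genes))
```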
CCL25 and EGFR diagnostic effectiveness prediction
CCL25 and EGFR showed low expression in the control group and high expression in the treatment group (P<0.001, Figure 10A,10B). The AUCs of CCL25 and EGFR were 0.852 and 0.867, respectively (P<0.05, Figure 10C,10D).
More than 75% of patients with T1DM exhibit onset in childhood, which is preceded by a long evolution process (13,14). Early prediction of the disease is crucial for screening high-risk individuals and timely immunological intervention. Therefore, early diagnostic markers for T1DM are of utmost importance. T1DM is a genetic and environmental interaction disease caused by multiple gene interactions, including abnormal T cell and B cell apoptosis, leading to pancreatic β cell damage.
Previous studies have explored key genes involved in the development of T1DM. One study used polymerase chain reaction (PCR) and Western blot techniques to examine caspase-3 messenger RNA (mRNA) and protein levels in PBMCs of T1DM patients and found that caspase-3 expression was reduced at both the mRNA and protein levels (15). The deficit in caspase-3 expression and function was related to T1DM development. Vendrame et al. (16) investigated the Fas resistance mechanism in T1DM patients and showed that the activity of caspase-8 and caspase-9 was decreased in T1DM patients and that the level of Fas resistance was directly related to the timing of autoimmune onset: the higher the Fas resistance, the earlier the onset of T1DM. In recent years, the development of high-throughput sequencing technology and machine learning algorithms has provided new insights into identifying key genes involved in T1DM pathogenesis.
This study identified the key pathogenic genes of T1DM in children through WGCNA, including CCL25 and EGFR. We believe that CCL25 and EGFR are involved in the occurrence and progression of T1DM in children.
CCL25 is a classic chemoattractant factor that can drive immune cell accumulation in specific regions (17). Its receptor is CCR9 (17). The CCL25-CCR9 axis participates in T cell development and migration, thus promoting inflammatory responses (18,19). The interaction between CCL25 and CCR9 can also activate Akt/protein kinase B to inhibit cell apoptosis and support T cell survival during thymic maturation (20). The mechanism of T1DM described in previous studies is consistent with our results. The immunopathogenesis of T1DM begins with the loss of self-tolerance: autoantibodies against β cell antigens appear, and T cells mediate β cell destruction. This breach of peripheral self-tolerance is mainly manifested in the activation, proliferation, and differentiation of autoreactive T cells into highly pathogenic effector T cells and memory T cells in the circulation. CCL25 may be a key regulatory point in this process. However, there is no direct evidence that CCL25 promotes the recruitment of autoreactive T cells to the pancreas, and this requires further experimental validation.
EGFR, a member of the epidermal growth factor receptor family (21-23), specifically binds to ligands such as epidermal growth factor, transforming growth factor-α, amphiregulin, betacellulin, heparin-binding EGF-like growth factor, and epiregulin (24,25). EGFR is involved in the regulation of various inflammatory reactions (26-28). EGFR can induce the expression of metalloproteinase through the Toll-like receptor family signaling pathway (29). Upon binding to a ligand, EGFR forms a dimer and activates the downstream MAPK and PI3K signaling pathways by phosphorylation, which regulates the transcription of cytokines (30-33). EGFR can regulate a series of inflammatory factors, including interleukin-1, -6, -8, and tumor necrosis factor. There is no literature describing the correlation between EGFR and childhood T1DM. Further research is needed to explore the pathogenic mechanisms of EGFR in childhood T1DM.
There is still a lack of relevant research elucidating the MFs of CCL25 and EGFR in childhood T1DM and other types of diabetes. This study only confirms their diagnostic efficacy. CCL25 and EGFR are key factors in promoting inflammatory responses, and blocking the MFs of CCL25 and EGFR may help suppress inflammatory reactions, assisting in the treatment of childhood T1DM. CCL25 and EGFR are potential therapeutic targets. It should also be noted that due to different screening criteria, the DEGs in this study are not entirely consistent with those in the study by Santos et al. (34). That study (34) suggests that inflammatory responses play an important role in the pathogenesis of childhood T1DM, which is consistent with our research findings.
This study has some limitations. Firstly, the sample size of this study is relatively small, and more data could not be obtained from public databases due to the limited attention paid to childhood T1DM. Secondly, the lack of external data validation calls for further establishment of an external cohort to validate the diagnostic efficiency of the key genes and conduct experiments with large sample sizes. Furthermore, this study lacks prognostic information of the samples, preventing an in-depth investigation of the impact of key genes on patient prognosis. Finally, we have not been able to clarify the pathogenic mechanisms of the key genes, and further basic experiments are still needed to explore the MFs and related mechanism pathways of the key genes CCL25 and EGFR in the onset and progression of the disease.
In conclusion, this study identified key pathogenic genes for pediatric T1DM using WGCNA, including CCL25 and EGFR. Both of them are upregulated in blood samples of childhood T1DM and have good diagnostic efficacy for childhood T1DM, serving as potential biological markers.
Reporting Checklist: The authors have completed the STREGA reporting checklist. Available at https://tp.amegroups.com/article/view/10.21037/tp-23-201/rc
Peer Review File: Available at https://tp.amegroups.com/article/view/10.21037/tp-23-201/prf
Conflicts of Interest: All authors have completed the ICMJE uniform disclosure form (available at https://tp.amegroups.com/article/view/10.21037/tp-23-201/coif). The authors have no conflicts of interest to declare.
Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. The study was conducted in accordance with the Declaration of Helsinki (as revised in 2013).
Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.
- Fralick M, Jenkins AJ, Khunti K, et al. Global accessibility of therapeutics for diabetes mellitus. Nat Rev Endocrinol 2022;18:199-204. [Crossref] [PubMed]
- Cloete L. Diabetes mellitus: an overview of the types, symptoms, complications and management. Nurs Stand 2022;37:61-6. [Crossref] [PubMed]
- Beltrand J, Busiah K, Vaivre-Douret L, et al. Neonatal Diabetes Mellitus. Front Pediatr 2020;8:540718. [Crossref] [PubMed]
- Neu A, Bürger-Büsing J, Danne T, et al. Diagnosis, Therapy and Follow-Up of Diabetes Mellitus in Children and Adolescents. Exp Clin Endocrinol Diabetes 2019;127:S39-72. [Crossref] [PubMed]
- Czenczek-Lewandowska E, Grzegorczyk J, Mazur A. Physical activity in children and adolescents with type 1 diabetes and contem-porary methods of its assessment. Pediatr Endocrinol Diabetes Metab 2018;24:179-84. [Crossref] [PubMed]
- Pierce JS, Kozikowski C, Lee JM, et al. Type 1 diabetes in very young children: a model of parent and child influences on management and outcomes. Pediatr Diabetes 2017;18:17-25. [Crossref] [PubMed]
- Monaghan M, Bryant BL, Inverso H, et al. Young Children with Type 1 Diabetes: Recent Advances in Behavioral Research. Curr Diab Rep 2022;22:247-56. [Crossref] [PubMed]
- Manousaki D, Harroud A, Mitchell RE, et al. Vitamin D levels and risk of type 1 diabetes: A Mendelian randomization study. PLoS Med 2021;18:e1003536. [Crossref] [PubMed]
- Mitri J, Pittas AG. Vitamin D and diabetes. Endocrinol Metab Clin North Am 2014;43:205-32. [Crossref] [PubMed]
- Mignogna C, Maddaloni E, D'Onofrio L, et al. Investigational therapies targeting CD3 for prevention and treatment of type 1 diabetes. Expert Opin Investig Drugs 2021;30:1209-19. [Crossref] [PubMed]
- Fan W, Pang H, Xie Z, et al. Circular RNAs in diabetes mellitus and its complications. Front Endocrinol (Lausanne) 2022;13:885650. [Crossref] [PubMed]
- Dong Y, Zhai W, Xu Y. Bioinformatic gene analysis for potential biomarkers and therapeutic targets of diabetic nephropathy associated renal cell carcinoma. Transl Androl Urol 2020;9:2555-71. [Crossref] [PubMed]
- Siller AF, Tosur M, Relan S, et al. Challenges in the diagnosis of diabetes type in pediatrics. Pediatr Diabetes 2020;21:1064-73. [Crossref] [PubMed]
- Jackson S, Creo A, Al Nofal A. Management of Type 1 Diabetes in Children in the Outpatient Setting. Pediatr Rev 2022;43:160-70. [Crossref] [PubMed]
- De Franco S, Chiocchetti A, Ferretti M, et al. Defective function of the Fas apoptotic pathway in type 1 diabetes mellitus correlates with age at onset. Int J Immunopathol Pharmacol 2007;20:567-76. [Crossref] [PubMed]
- Vendrame F, Santangelo C, Misasi R, et al. Defective lymphocyte caspase-3 expression in type 1 diabetes mellitus. Eur J Endocrinol 2005;152:119-25. [Crossref] [PubMed]
- Wu X, Sun M, Yang Z, et al. The Roles of CCR9/CCL25 in Inflammation and Inflammation-Associated Diseases. Front Cell Dev Biol 2021;9:686548. [Crossref] [PubMed]
- Xu B, Deng C, Wu X, et al. CCR9 and CCL25: A review of their roles in tumor promotion. J Cell Physiol 2020;235:9121-32. [Crossref] [PubMed]
- Atanes P, Lee V, Huang GC, et al. The role of the CCL25-CCR9 axis in beta-cell function: potential for therapeutic intervention in type 2 diabetes. Metabolism 2020;113:154394. [Crossref] [PubMed]
- Svensson M, Agace WW. Role of CCL25/CCR9 in immune homeostasis and disease. Expert Rev Clin Immunol 2006;2:759-73. [Crossref] [PubMed]
- Sharifi J, Khirehgesh MR, Safari F, et al. EGFR and anti-EGFR nanobodies: review and update. J Drug Target 2021;29:387-402. [Crossref] [PubMed]
- Rayego-Mateos S, Rodrigues-Diez R, Morgado-Pascual JL, et al. Role of Epidermal Growth Factor Receptor (EGFR) and Its Ligands in Kidney Inflammation and Damage. Mediators Inflamm 2018;2018:8739473. [Crossref] [PubMed]
- Kirchner M, Kluck K, Brandt R, et al. The immune microenvironment in EGFR- and ERBB2-mutated lung adenocarcinoma. ESMO Open 2021;6:100253. [Crossref] [PubMed]
- Komposch K, Sibilia M. EGFR Signaling in Liver Diseases. Int J Mol Sci 2015;17:30. [Crossref] [PubMed]
- Vecchione L, Jacobs B, Normanno N, et al. EGFR-targeted therapy. Exp Cell Res 2011;317:2765-71. [Crossref] [PubMed]
- Lisi S, Sisto M, Ribatti D, et al. Chronic inflammation enhances NGF-β/TrkA system expression via EGFR/MEK/ERK pathway activation in Sjögren's syndrome. J Mol Med (Berl) 2014;92:523-37. [Crossref] [PubMed]
- Wang X, Reyes ME, Zhang D, et al. EGFR signaling promotes inflammation and cancer stem-like activity in inflammatory breast cancer. Oncotarget 2017;8:67904-17. [Crossref] [PubMed]
- Shimoda M, Horiuchi K, Sasaki A, et al. Epithelial Cell-Derived a Disintegrin and Metalloproteinase-17 Confers Resistance to Colonic Inflammation Through EGFR Activation. EBioMedicine 2016;5:114-24. [Crossref] [PubMed]
- van der Post S, Birchenough GMH, Held JM. NOX1-dependent redox signaling potentiates colonic stem cell proliferation to adapt to the intestinal microbiota by linking EGFR and TLR activation. Cell Rep 2021;35:108949. [Crossref] [PubMed]
- Corcoran RB, Ebi H, Turke AB, et al. EGFR-mediated re-activation of MAPK signaling contributes to insensitivity of BRAF mutant colorectal cancers to RAF inhibition with vemurafenib. Cancer Discov 2012;2:227-35. [Crossref] [PubMed]
- Yang W, Chen N, Li L, et al. Favorable Immune Microenvironment in Patients with EGFR and MAPK Co-Mutations. Lung Cancer (Auckl) 2020;11:59-71. [Crossref] [PubMed]
- Lee JH, Liu R, Li J, et al. EGFR-Phosphorylated Platelet Isoform of Phosphofructokinase 1 Promotes PI3K Activation. Mol Cell 2018;70:197-210.e7. [Crossref] [PubMed]
- Guerra F, Quintana S, Giustina S, et al. Investigation of EGFR/pi3k/Akt signaling pathway in seminomas. Biotech Histochem 2021;96:125-37. [Crossref] [PubMed]
- Santos AS, Cunha-Neto E, Gonfinetti NV, et al. Prevalence of Inflammatory Pathways Over Immuno-Tolerance in Peripheral Blood Mononuclear Cells of Recent-Onset Type 1 Diabetes. Front Immunol 2021;12:765264. [Crossref] [PubMed]
(English Language Editor: J. Jones)