In late 2019, the emergence of COVID-19, driven by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), triggered a global pandemic. Within 210 days, the virus had claimed more than 761,779 lives worldwide. SARS-CoV-2 primarily targets the upper respiratory tract and subsequently causes substantial damage to the lower respiratory tract, often resulting in severe pneumonia [1],[2]. Vulnerable populations, particularly the elderly and those with preexisting comorbidities, face a heightened risk of severe health complications, marked by cytokine up-regulation and the onset of acute respiratory distress syndrome (ARDS) [3]. A substantial challenge in the management of COVID-19 lies in the stratification of disease severity and the subsequent customization of treatment strategies [4],[5]. COVID-19 encompasses a spectrum of severity, categorized into four levels: asymptomatic, mild, moderate, and severe [6],[7]. Each category is characterized by distinct clinical presentations, accompanied by differential gene regulation. Numerous studies have been undertaken to identify the molecular markers closely linked to the severity of COVID-19 [8]. Among these markers, immunological indicators such as atypical immune cell counts and proinflammatory cytokine levels play a pivotal role in distinguishing between disease severities. Uniform treatment strategies, such as antiviral medications, corticosteroids, or monoclonal antibodies, frequently prove inadequate in accommodating the wide array of clinical presentations and immune reactions observed among COVID-19 patients [9],[10]. Throughout the COVID-19 pandemic, an extensive volume of transcriptomic data was amassed, encompassing RNA-sequencing datasets that delineate gene expression patterns in diverse COVID-19 patients [11].
Transcriptomic data analysis serves as a pivotal tool in discerning the molecular distinctions among patients experiencing varying degrees of COVID-19 severity, including asymptomatic, mild, moderate, and severe cases [12]. By scrutinizing the transcriptome, which encompasses the complete array of RNA molecules within a patient's cells, researchers can glean valuable insights into the underlying molecular mechanisms contributing to the diverse clinical outcomes observed in COVID-19 [13]. This analytical process typically entails the extraction of RNA from patient specimens, such as blood or respiratory tissues, followed by the application of high-throughput sequencing methods like RNA-seq to quantify the expression levels of thousands of genes [11]–[14]. Subsequently, bioinformatics tools are harnessed to pinpoint differentially expressed genes and pathways linked to disease severity. Comparative analysis of transcriptomic profiles across these patient groups unveils pivotal genes and regulatory pathways associated with immune responses, inflammation, viral replication, and tissue damage, thereby illuminating the molecular determinants influencing disease progression [15],[16]. These revelations hold the potential to facilitate the development of more precise diagnostic tools, therapeutic approaches, and personalized treatments, ultimately advancing our comprehension of COVID-19 pathogenesis and patient outcomes.
The pathogenesis of COVID-19 represents a multifaceted and intricate interplay of various factors that contribute to the variable clinical outcomes observed in infected individuals, ranging from asymptomatic cases to severe disease presentations. To date, a comprehensive understanding of the underlying mechanisms governing the diverse disease severity spectrum remains elusive. This research endeavors to bridge this knowledge gap by focusing on the identification and characterization of pivotal genes that play a substantial role in the pathophysiological processes associated with distinct disease severities. By elucidating the molecular determinants and their interactions, we aim to shed light on the intricate regulatory networks and pathways that are modulated by these genes. Such insights may facilitate the development of targeted interventions and therapeutic strategies, ultimately aiding in the effective management of COVID-19 across its clinical spectrum.
In this research, we utilized the Gene Expression Omnibus (GEO) repository, a prominent public resource for the archival and dissemination of high-throughput functional genomic data derived from various techniques, including microarray and next-generation sequencing (NGS) [17]. Our study, focused on COVID-19 investigation, harnessed two specific datasets, namely GSE196822 and GSE166424, both of which were processed through the method of expression profiling by high-throughput sequencing. The GSE196822 dataset encompassed a total of 49 samples, while the GSE166424 dataset included 38 samples. All samples in these datasets were sequenced utilizing the GPL20301 Illumina HiSeq 4000 platform, specifically designed for Homo sapiens.
The primary objective of this research is to elucidate the genetic underpinnings associated with the diverse clinical presentations observed in COVID-19, ranging from asymptomatic cases to severe manifestations. To accomplish this, the samples in each dataset were grouped by disease severity, so that each group corresponds to one level of the clinical spectrum of COVID-19. The resulting breakdown of samples per severity category is given in Table 1. Within the GSE196822 dataset, one sample each from the healthy and COVID-bacterial coinfection categories was unavailable for analysis through the GEO2R platform. Consequently, these two samples were excluded from the subsequent analytical procedures, which is why the GSE196822 counts in Table 1 sum to 47 rather than 49.
| Disease severity | GSE196822 (n = 49) | GSE166424 (n = 38) |
| --- | --- | --- |
| Healthy | 8 | 2 |
| Asymptomatic COVID-19 | 8 | 30 |
| Mild COVID-19 | 9 | 2 |
| Moderate COVID-19 | 10 | 2 |
| Severe COVID-19 | 7 | 2 |
| COVID-bacterial coinfection | 5 | 0 |
The raw datasets underwent preprocessing within the GEO2R platform, employing the robust multi-array average (RMA) method in R. Differential expression analysis was then carried out with fixed cutoff criteria. Specifically, differentially expressed genes (DEGs) were extracted from the preprocessed dataset when they had an adjusted p-value less than or equal to 0.05 and a log-fold change above +1.5 (upregulated genes) or below −1.5 (downregulated genes) [18].
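The cutoff criteria can be expressed as a small classification rule. The following is a minimal sketch rather than the GEO2R implementation: the `GeneStat` record, the `classify` function, and the example rows are hypothetical names and values introduced only for illustration, and only the two thresholds come from the text.

```python
from typing import NamedTuple


class GeneStat(NamedTuple):
    gene: str
    log_fc: float  # log-fold change from the differential expression table
    adj_p: float   # adjusted p-value


ADJ_P_CUTOFF = 0.05  # from the text: adjusted p-value <= 0.05
LOGFC_CUTOFF = 1.5   # from the text: logFC > +1.5 or < -1.5


def classify(stat: GeneStat) -> str:
    """Return 'up', 'down', or 'ns' (not selected) for one gene."""
    if stat.adj_p > ADJ_P_CUTOFF:
        return "ns"
    if stat.log_fc > LOGFC_CUTOFF:
        return "up"
    if stat.log_fc < -LOGFC_CUTOFF:
        return "down"
    return "ns"


# Hypothetical rows, invented for illustration only
rows = [
    GeneStat("GENE_A", 2.3, 0.001),
    GeneStat("GENE_B", -1.9, 0.020),
    GeneStat("GENE_C", 3.1, 0.200),  # large change, but not significant
    GeneStat("GENE_D", 0.8, 0.001),  # significant, but small change
]
calls = {row.gene: classify(row) for row in rows}
print(calls)
```

Both conditions must hold at once, so a gene with a large fold change but a non-significant adjusted p-value is not reported as a DEG.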
To visualize the significant DEGs, volcano and mean difference (MD) plots were generated using the default adjusted p-value threshold of 0.05 with the Benjamini-Hochberg false discovery rate (FDR) method. In these plots, upregulated genes are shown in red and downregulated genes in blue. The volcano plot displays the log2-fold change against statistical significance (−log10 p-value), while the MD plot displays the log2-fold change against the average log2 expression. These plots were generated with the limma package.
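To make the axes of the two plots concrete, the sketch below computes the coordinates and colour of a single gene. It is an illustration under stated assumptions, not the limma or GEO2R plotting code: the function name `plot_coordinates` is hypothetical, the example values are invented, and the grey colour for non-significant genes is an assumption, since the text specifies only red and blue.

```python
import math


def plot_coordinates(log2_fc, avg_log2_expr, p_value, adj_p, alpha=0.05):
    """Return the volcano point, the MD point, and the colour for one gene."""
    # Volcano plot: log2-fold change (x) against -log10 p-value (y)
    volcano = (log2_fc, -math.log10(p_value))
    # Mean difference plot: average log2 expression (x) against log2-fold change (y)
    mean_difference = (avg_log2_expr, log2_fc)
    if adj_p <= alpha and log2_fc > 0:
        colour = "red"    # significantly upregulated
    elif adj_p <= alpha and log2_fc < 0:
        colour = "blue"   # significantly downregulated
    else:
        colour = "grey"   # assumed colour for non-significant genes
    return volcano, mean_difference, colour


# Hypothetical gene, invented for illustration only
volcano, mean_difference, colour = plot_coordinates(
    log2_fc=2.0, avg_log2_expr=7.5, p_value=0.001, adj_p=0.01
)
print(volcano, mean_difference, colour)
```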
The top statistically significant DEGs were subjected to a protein-protein interaction network (PPIN) analysis using the STRING database. STRING is a comprehensive resource that integrates various types of protein relationships, encompassing both functional and physical interactions. A functional protein association network was constructed for the DEGs with Homo sapiens as the reference organism [19]. The network was built with a high-confidence score threshold of 0.900 to ensure robustness and reliability. The active interaction sources included text mining, experimental data, curated databases, co-expression, gene neighborhood (physical proximity in the genome), gene fusion events, and co-occurrence. To further explore and visualize the significant interactions within the network, k-means clustering was applied at the same confidence score of 0.900 to identify major interactors.
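The effect of the 0.900 confidence threshold can be illustrated on a toy edge list. This sketch does not call STRING and does not reproduce its k-means clustering. Instead, it swaps in a simple degree count as a stand-in for spotting highly connected proteins, and the edges, protein labels, and scores are hypothetical.

```python
from collections import Counter

CONFIDENCE_CUTOFF = 0.900  # high-confidence threshold stated in the text

# Hypothetical (protein_a, protein_b, combined_score) edges
edges = [
    ("P1", "P2", 0.95),
    ("P1", "P3", 0.91),
    ("P2", "P3", 0.40),
    ("P3", "P4", 0.90),
    ("P4", "P5", 0.89),
]

# Keep only edges at or above the confidence cutoff
kept = [(a, b, score) for a, b, score in edges if score >= CONFIDENCE_CUTOFF]

# Count how many retained edges touch each protein (node degree)
degree = Counter()
for a, b, _ in kept:
    degree[a] += 1
    degree[b] += 1

for protein, n_edges in degree.most_common():
    print(protein, n_edges)
```

Proteins that lose all their edges at this cutoff, such as `P5` here, drop out of the network entirely, which is why a stringent threshold yields a smaller but more reliable network.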
The statistically significant DEGs were subjected to KEGG and gene ontology (GO) analysis. To maintain the study's biological relevance, we used the human species as the reference organism, and the FDR cutoff was set at 0.05, minimizing the likelihood of false positives [20].
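The text does not state which statistical test underlies the enrichment analysis. A common choice for KEGG and GO over-representation analysis is the hypergeometric test, and the sketch below assumes that test purely for illustration. The function name `hypergeom_sf` and all counts are hypothetical. In practice, the resulting p-values would then be corrected for multiple testing before applying the 0.05 FDR cutoff.

```python
from math import comb


def hypergeom_sf(k, population, pathway_size, sample_size):
    """P(X >= k) when sample_size genes are drawn without replacement
    from a population containing pathway_size pathway members."""
    upper = min(pathway_size, sample_size)
    total = comb(population, sample_size)
    favourable = sum(
        comb(pathway_size, i) * comb(population - pathway_size, sample_size - i)
        for i in range(k, upper + 1)
    )
    return favourable / total


# Hypothetical counts: 20,000 background genes, 100 genes in the pathway,
# 200 DEGs, of which 8 fall in the pathway
p_value = hypergeom_sf(k=8, population=20000, pathway_size=100, sample_size=200)
print(p_value)
```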
While the GSE196822 dataset includes samples from COVID-19 patients in India and GSE166424 includes samples from asymptomatic COVID-19 patients in Singapore, both datasets encompass a diverse range of disease severities and demographic characteristics. The inclusion of samples from different regions allows for the exploration of potential regional variations in disease presentation and molecular profiles.
By utilizing transcriptomic data from distinct geographical locations, our study has the unique opportunity to conduct cross-validation and comparative analyses. Through these analyses, we aim to identify common molecular pathways underlying COVID-19 pathogenesis, as well as unique regional signatures that may contribute to variations in disease severity and outcomes. This approach not only enhances the robustness and generalizability of our findings but also provides valuable insights into the global dynamics of COVID-19 and the molecular mechanisms driving its clinical manifestations.
Background correction and normalization are essential data preprocessing steps in microarray analysis to ensure the accuracy and reliability of results. Background correction removes non-specific signal, or noise, from the data. Microarray experiments may capture background noise that distorts the quantification of gene expression levels, and subtracting the background signal improves the accuracy of these measurements. Normalization removes systematic variation between samples, such as differences in labeling efficiency, hybridization conditions, or scanner sensitivity. It ensures that expression levels across different samples are directly comparable, making it possible to identify true biological differences. The median-centered values in the boxplot (Figure 1) indicate successful normalization and cross-comparability: the datasets have been processed so that they are aligned for downstream differential gene expression (DGE) analysis.
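What "median-centered" means for a boxplot can be shown with a toy example. The sketch below is only an illustration of the concept, not the normalization that GEO2R performs: the function `median_center` and the sample values are hypothetical.

```python
from statistics import median


def median_center(log2_values):
    """Shift one sample's log2 values so that their median becomes zero."""
    m = median(log2_values)
    return [value - m for value in log2_values]


# Hypothetical log2 expression values for two samples
samples = {
    "sample_1": [4.0, 6.0, 8.0],
    "sample_2": [5.0, 7.0, 12.0],
}
centered = {name: median_center(values) for name, values in samples.items()}

for name, values in centered.items():
    print(name, values, "median:", median(values))
```

After centering, every sample has the same median, so the boxes in a boxplot line up even though the spread within each sample is unchanged.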
Differential gene expression analysis is the principal application of RNA-sequencing data. It identifies genes that are regulated differently under different conditions. The adjusted p-value was determined using the Benjamini-Hochberg method, and an adjusted p-value of less than 0.05 was deemed significant. The log-fold change threshold was set at |logFC| > 1.5, so that entries with logFC > 1.5 were considered upregulated and those with logFC < −1.5 were considered downregulated. A volcano plot was constructed with the limma package (Figure 2) to depict the relationship between statistical significance (−log10 p-value) and the magnitude of change, represented by the log2-fold change. This representation makes it easy to identify genes that show substantial changes in expression and are also statistically significant.
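The Benjamini-Hochberg adjustment multiplies each p-value by the number of tests divided by its rank, then enforces monotonicity from the largest p-value downward. The following is a minimal sketch of that procedure for illustration. It is not the limma or GEO2R code, and the function name and input p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values never increase
    # as the raw p-value decreases
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values, invented for illustration only
raw = [0.01, 0.04, 0.03, 0.005]
adjusted = benjamini_hochberg(raw)
print([round(value, 4) for value in adjusted])
```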
As previously outlined, the core objective of our study is to discern the genes associated with the clinical manifestations of asymptomatic, mild, moderate, and severe cases of COVID-19. To achieve this, we conducted a differential gene expression analysis, identifying the DEGs for each of the aforementioned clinical categories as follows.
Genes associated with various aspects of immune and inflammatory responses were found to be upregulated in asymptomatic COVID-19. DEFA5 and DEFA8P belong to the defensin family, while PRTN3, CTSG, and ELANE encode neutrophil serine proteases; both groups are known for their antimicrobial and immunomodulatory functions (Supplementary Table 1). Conversely, the downregulated genes in asymptomatic COVID-19 may indicate a dampened immune response or a specific modulation of host gene expression in these cases. Among them, members of the TRAJ family were significantly downregulated (Supplementary Table 1).
The genes associated with neutrophils and their activation (e.g., DEFT1P, CD177, PRTN3, ELANE, and OLAH) were found to be associated with the clinical presentation of mild COVID-19 (Table 3). The downregulated genes in mild COVID-19 cases are less characterized (e.g., SUSD2, ZNF181, CELF2-AS1, CFAP97D2, NBEAL1, and LINC02067), and their functions in the context of this disease remain unclear. In mild COVID-19 patients, we observed a commonality in gene expression patterns. Specifically, genes such as PRTN3 and ELANE, which were found to be upregulated in asymptomatic cases, displayed a similar upregulation in mild cases (Supplementary Table 1 and Figure 3).
In moderate COVID-19 cases, the observed upregulation of immune-related genes (DEFT1P, OLAH, PCSK9, DAAM2, ADAMTS2, and DAAM2-AS1) and the downregulation of genes associated with epigenetic regulation and cellular functions (SLCO5A1, NRCAM, NOG, TET1, MAMDC2, VMO1, and SIRPG-AS1) indicates a complex interplay of molecular responses to the virus (Figure 3 and Supplementary Table 2). Genes such as TET1 and CFAP97D2, which exhibited downregulation in mild cases, displayed a similar downregulation in moderate cases (Figure 3).
In severe COVID-19, several of the upregulated genes have functions linked to immune and inflammatory responses. For instance, DEFT1P, OLAH, DEFA5, ADAMTS2, DEFT1P2, CD177, PCSK9, and MAOA are associated with various aspects of the immune system. The downregulated genes in severe COVID-19 include FCER1A, CPA5, LRRN3, TRBJ1-2, STEAP1B-AS1, NOG, TRAJ49, TRAJ54, ZFP37, and TRBV4-1. These genes have diverse functions, with some associated with immune responses, cellular adhesion, and signaling pathways. The downregulation of TRBJ1-2, TRAJ49, and TRAJ54, which are T-cell-receptor gene segments, suggests potential immune suppression or altered T-cell responses in severe cases (Supplementary Table 2).
In COVID-bacterial coinfection, OLAH, ADAMTS2, DAAM2, DAAM2-AS1, MAOA, CD177, PCSK9, CLRN1-AS1, TIMP4, and VSIG4 are among the upregulated genes. The downregulated genes in severe COVID-bacterial coinfection, including CLEC4F (which appears twice among the top entries), IFI27, VMO1, HLA-DPB2, SLCO5A1, HES4, LYPD2, SNORD141A, and LINC02086, suggest potential alterations in immune and cellular responses (Supplementary Table 3).
The data reveal several patterns in gene expression across COVID-19 severity levels and coinfection. Notably, PRTN3 and ELANE are upregulated in both asymptomatic and mild cases, indicating a shared immune response that might contribute to viral clearance without severe symptoms. In contrast, TET1 and CFAP97D2 are downregulated in mild and moderate cases, suggesting potential immune modulation in the transition between these severity levels (Supplementary Table 4 and Figure 3). NOG, downregulated in both moderate and severe cases, plays a role in tissue homeostasis and cell differentiation. Its decreased expression could disrupt tissue repair mechanisms, possibly contributing to more severe clinical outcomes (Supplementary Table 4 and Figure 3). Additionally, MAOA is upregulated in both severe cases and bacterial coinfections, reflecting an altered host response and underlining the impact of coinfection on the immune response (Supplementary Table 4 and Figure 3).
The consistent downregulation of LYPD2 in coinfection and asymptomatic cases may signify immune suppression or altered responses to viral and bacterial pathogens (Supplementary Table 4 and Figure 3). In the asymptomatic-to-severe comparison, the upregulation of DEFA5 implies robust defense mechanisms, while TRAJ49 downregulation may affect immune responses by altering T-cell-receptor gene segment expression (Supplementary Table 4 and Figure 3). DEFT1P is upregulated in all three categories (asymptomatic, mild, and moderate), emphasizing its role in mounting and maintaining immune responses (Supplementary Table 4 and Figure 3). SLCO5A1 and SNORD141A are downregulated in mild, moderate, and coinfection cases, suggesting potential immune suppression (Supplementary Table 4 and Figure 3). CD177, on the other hand, is upregulated in mild, severe, and coinfection cases, implying a common immune response signature (Supplementary Table 4 and Figure 3).
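Shared genes of this kind can be found with plain set operations. The sketch below uses only the upregulated genes named in the text for each category, not the full supplementary tables, so its output can differ from the complete analysis. The variable and function names are hypothetical.

```python
# Upregulated genes named in the text for each category (partial lists only)
upregulated = {
    "asymptomatic": {"DEFA5", "DEFA8P", "PRTN3", "CTSG", "ELANE"},
    "mild": {"DEFT1P", "CD177", "PRTN3", "ELANE", "OLAH"},
    "moderate": {"DEFT1P", "OLAH", "PCSK9", "DAAM2", "ADAMTS2", "DAAM2-AS1"},
    "severe": {"DEFT1P", "OLAH", "DEFA5", "ADAMTS2", "DEFT1P2", "CD177",
               "PCSK9", "MAOA"},
    "coinfection": {"OLAH", "ADAMTS2", "DAAM2", "DAAM2-AS1", "MAOA", "CD177",
                    "PCSK9", "CLRN1-AS1", "TIMP4", "VSIG4"},
}


def categories_with(gene):
    """List the categories, in dictionary order, whose set contains the gene."""
    return [name for name, genes in upregulated.items() if gene in genes]


# Genes shared between two categories: set intersection
shared_asymptomatic_mild = upregulated["asymptomatic"] & upregulated["mild"]
print(sorted(shared_asymptomatic_mild))

# Categories in which a single gene is upregulated
print(categories_with("CD177"))
print(categories_with("MAOA"))
```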
The PPIN analysis (Figure 4) represents the interactions between the top DEGs. These interactions reveal how different proteins encoded by these genes collaborate and communicate within a cellular context. The presence of the RPL (ribosomal protein large subunit) family as a major interaction (Figure 4) suggests a significant role of ribosomal proteins. Their prominence may indicate an upregulation of protein synthesis or specific cellular processes related to ribosomal functions in COVID-19. The identification of the CEACAM (carcinoembryonic antigen-related cell adhesion molecule) family as a major interaction (Figure 4) implies the importance of cell adhesion and intercellular communication. The identification of the NLR (NOD-like receptor) protein family as a major interaction (Figure 4) implies an association with the immune system and inflammation. NLR proteins are involved in the detection of pathogens and cellular stress, playing a key role in the regulation of innate immunity. Their presence may indicate the activation of specific immune pathways in response to COVID-19.
In the KEGG pathway analysis, the results reveal a multifaceted landscape of pathway involvement in COVID-19. Notably, the presence of "transcriptional misregulation in cancer" suggests potential links to cancer-related gene expression dysregulation. "Staphylococcus aureus infection" points to the condition's influence on host responses to bacterial infections. The "NOD-like receptor signaling pathway" underscores an activated innate immune response. The inclusion of "neutrophil extracellular trap (NET) formation" implies an ongoing immune response featuring neutrophil engagement (Figure 5).
The biological processes, cellular components, and molecular functions highlighted in the analysis collectively underscore the prominent role of immune responses and granule activation in the condition. This signifies an active immune response, potentially involving neutrophils and regulated exocytosis of cellular components. The significance of specific granules, lysosomes, and secretory vesicles as cellular components underscores their integral role in the studied condition, while the presence of azurophil granules and secretory vesicles further supports an active immune response (Figure 5). Additionally, the molecular functions related to receptor activity, including prostaglandin and prostanoid receptor activity, suggest the involvement of signaling pathways associated with these receptors. The identification of carbohydrate binding and glycosaminoglycan binding implies interactions with carbohydrates and proteoglycans in the molecular landscape of the condition (Figure 5).
We examined the interactions between SARS-CoV-2 and human proteins using the IntAct database, a comprehensive repository of molecular interaction data that includes interactions involving human proteins. To identify the specific interactions between the virus and our proteins of interest, we applied a rigorous filtering process. The filtering criteria included the molecular interaction (MI) score, which helped us identify positive interactions, and the detection method, which was used to ensure the reliability of the interactions. Our focus was primarily on interactions that showed a direct, physical association between viral and human proteins, indicative of a close and biologically significant connection.
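A filter of this kind can be sketched as a predicate over interaction records. This is an illustration only and does not query IntAct: the records, the protein labels, and in particular the MI score cutoff of 0.5 are hypothetical, since the text does not state the threshold that was used.

```python
MI_SCORE_CUTOFF = 0.5  # hypothetical threshold; the text does not state one
ACCEPTED_TYPES = {"direct interaction", "physical association"}

# Hypothetical virus-host interaction records, invented for illustration only
records = [
    {"viral": "V1", "human": "H1", "type": "direct interaction", "mi_score": 0.72},
    {"viral": "V1", "human": "H2", "type": "association", "mi_score": 0.80},
    {"viral": "V2", "human": "H3", "type": "physical association", "mi_score": 0.35},
    {"viral": "V2", "human": "H4", "type": "physical association", "mi_score": 0.56},
]


def keep(record):
    """Keep direct or physical interactions that reach the MI score cutoff."""
    return (
        record["type"] in ACCEPTED_TYPES
        and record["mi_score"] >= MI_SCORE_CUTOFF
    )


selected = [(r["viral"], r["human"]) for r in records if keep(r)]
print(selected)
```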
These interactions were detected in various types of host cells, which provides insight into the infection process. Specifically, the majority of host interactions were detected in embryonic kidney cells, in vitro cells (typically referring to cells cultured in a laboratory setting), carcinoma cells (which are cancer cells), and African green monkey (Chlorocebus aethiops) cells (Supplementary Table 5), which are commonly used in virology research because of their susceptibility to viral infections. These observations underscore the relevance of these cellular environments in the context of SARS-CoV-2 infection and suggest that the virus may exhibit specific affinities for these cell types.
The global COVID-19 pandemic has created a profound health crisis, manifesting a spectrum of disease severity, spanning from asymptomatic cases to severe conditions, presenting diverse clinical symptoms [21]. However, the intricacies underlying the progression from asymptomatic to severe stages remain poorly understood, representing a significant research gap. Addressing this knowledge deficit is imperative, as it necessitates specialized investigations to discern the pivotal genes contributing to distinct disease severities and the accompanying modulated pathways [22]. The identification of such genes and pathways holds promise for the development of targeted therapeutic interventions based on these pathways. Additionally, understanding the progression of COVID-19 from an asymptomatic to a severe stage offers critical insights for devising effective prevention and treatment strategies [6],[7]. While prior studies have investigated the pathogenesis of COVID-19 and recognized potential therapeutic targets, knowledge gaps, and research priorities persist. One example is the elusive understanding of the long-term consequences of COVID-19, hindered by a scarcity of high-quality research, which remains a barrier to a comprehensive definition of conditions like long COVID and post-acute COVID.
To address the aforementioned research gap, we conducted a comprehensive analysis of DGE within COVID-19 datasets. This analysis enabled the identification of distinct gene profiles associated with varying severities of COVID-19.
Age, sample processing procedures, comorbidities, and medication usage are recognized as critical determinants that can significantly influence gene expression patterns in COVID-19 patients, necessitating careful consideration in the interpretation of findings. While our study did not explicitly account for age groups or specific comorbidity profiles in the transcriptomic analysis, we ensured the inclusion of a diverse range of individuals with varying disease severities and demographic characteristics within the GSE196822 and GSE166424 datasets. Future research endeavors could delve into subgroup analyses to elucidate differential gene expression patterns among different age groups or specific comorbidity profiles, thus providing further insight into disease pathogenesis. Moreover, variations in sample processing protocols and medication usage among patients have the potential to introduce biases in transcriptomic analyses. Although detailed clinical metadata and medication histories were not available in the GEO datasets utilized in our study, we implemented stringent quality-control measures during data preprocessing to mitigate technical artifacts and ensure data reliability.
In asymptomatic COVID-19, several genes related to immune and inflammatory responses were found to be upregulated. DEFA5 and DEFA8P belong to the defensin family, while PRTN3, CTSG, and ELANE encode neutrophil serine proteases; both groups are known for their antimicrobial and immunomodulatory functions [23]. The upregulation of these genes may suggest an activated immune response in asymptomatic cases, potentially aiding in controlling viral replication. ADAMTS2 is linked to extracellular matrix remodeling and could play a role in tissue repair and inflammation resolution [24]. PCSK9 may be involved in lipid metabolism and immune regulation [25]. Among the downregulated genes, TRAJ49, TRIM51EP, UICLM, LYPD2, IGLV4-60, TRDJ3, SHISA4, TRDV1, TRAJ10, and TRAJ33 do not have well-documented functions in the context of COVID-19. However, their downregulation may still suggest some level of immune suppression.
The upregulated genes in mild COVID-19, for example, DEFT1P, CD177, PRTN3, ELANE, and OLAH, are associated with neutrophils and their activation. These genes are critical components of the immune response, and their upregulation might indicate a heightened immune reaction in asymptomatic individuals, which could contribute to viral control without clinical symptoms [26]. TMEM54 and CLRN1 have roles that need further investigation in the context of COVID-19. The downregulated gene TET1 is involved in DNA demethylation, and its downregulation might influence epigenetic regulation [27]. SLCO5A1 is associated with organic anion transport and may have implications for drug metabolism or immune responses [28]. KLRB1 is associated with natural killer cells and T-cell function, and its downregulation may indicate immune suppression [29]. Other downregulated genes such as SUSD2, ZNF181, CELF2-AS1, CFAP97D2, NBEAL1, and LINC02067 are less characterized.
In moderate COVID-19, the significant upregulation of PCSK9 is particularly noteworthy, as it is known to play a role in lipid metabolism and immune regulation. Elevated PCSK9 expression may therefore indicate a link between lipid metabolism and the immune response in moderate cases [25]. Conversely, the downregulated genes in moderate COVID-19 include SLCO5A1 (which appears twice among the top entries), SNORD141A, CFAP97D2, NRCAM, NOG, TET1, MAMDC2, VMO1, and SIRPG-AS1. Some of these genes, such as TET1, are associated with epigenetic regulation, suggesting potential alterations in DNA methylation patterns in moderate cases [27]. Others, like SLCO5A1 and NRCAM, have roles in transport and cell adhesion processes, the downregulation of which could impact various cellular functions during infection [28].
The upregulated genes in severe COVID-19—DEFT1P, OLAH, DEFA5, ADAMTS2, DEFT1P2, CD177, PCSK9, and MAOA—are associated with various aspects of the immune system. DEFA5, DEFT1P, and CD177 are involved in neutrophil activation and host defense mechanisms [26]. Upregulation of these genes may indicate a heightened immune response in severe cases to combat the viral infection. MAOA is linked to the monoamine pathway and might be involved in regulating inflammatory responses [30]. The downregulation of TRBJ1-2, TRAJ49, and TRAJ54, which are related to T-cell-receptor gene segments, suggests potential immune suppression or altered T-cell responses in severe cases [31].
Lastly, in Covid-bacterial coinfection, OLAH and ADAMTS2 are genes associated with extracellular matrix remodeling that might be involved in tissue repair processes. CD177, PCSK9, and MAOA are known for their roles in immune regulation and lipid metabolism. Their upregulation may indicate an immune response and potential involvement in the host's defense against bacterial coinfection. DAAM2 and DAAM2-AS1 have functions that require further investigation in the context of coinfection. TIMP4 and VSIG4 are involved in tissue remodeling and immune regulation, respectively, and their upregulation suggests a potential immune response against the coinfection [32]. CLRN1-AS1's function in the context of coinfection remains to be elucidated. IFI27 is an interferon-induced gene, and its downregulation may indicate a modulation of the interferon response in coinfection.
In the present study, the genes ADAMTS2, PCSK9, and OLAH were found to be significantly upregulated across all disease severities. These findings align with past research wherein it is reported that ADAMTS2, a metalloprotease enzyme, is involved in the proinflammatory response to COVID-19 [24]. A study found that PCSK9, a gene that encodes for a protein involved in lipid metabolism, is upregulated in severe COVID-19 patients [33]. OLAH, a gene that encodes for an enzyme involved in lipid metabolism, was upregulated in severe COVID patients [33].
The RPL, CEACAM, and NLR protein families were identified as major interactors in the PPIN [34]. One study found that CEACAM5 and CEACAM6 are highly expressed in developing neutrophils/neutrophil progenitors in COVID-19 [35]. CEACAM5 has also been reported to be an important surface-attachment factor that facilitates the entry of the Middle East respiratory syndrome coronavirus (MERS-CoV). Disrupting the interaction between CEACAM5 and the MERS-CoV spike protein with an anti-CEACAM5 antibody, recombinant CEACAM5 protein, or small interfering RNA (siRNA) knockdown of CEACAM5 significantly inhibited the entry of MERS-CoV [36]. Separately, the neutrophil/lymphocyte ratio, a clinical measure that is also abbreviated NLR but is distinct from the NOD-like receptor family, has been suggested as a good predictive marker of disease severity and mortality in COVID-19 patients [37].
The findings from pathway analysis suggest that significant DEGs are involved in immune responses, particularly neutrophil-related processes, and may play a role in response to infections, both bacterial and viral. The presence of pathways related to cancer and coronavirus disease implies potential connections between the pathologies. These findings can guide further research into the molecular mechanisms underlying the studied condition and provide a foundation for understanding its pathogenesis and potential therapeutic targets.
ML algorithms have emerged as powerful tools for analyzing large datasets and uncovering complex relationships between patient characteristics and disease outcomes. In the context of COVID-19, ML holds promise for prognostic evaluation, biomarker identification, and treatment optimization [38]. By leveraging diverse data sources such as clinical records, imaging data, and genomic information, ML algorithms can assist in predicting disease severity, identifying relevant biomarkers, and personalizing treatment strategies. However, several challenges must be addressed before ML can be adopted in clinical practice [39]. One major hurdle is the interpretability of ML models, often referred to as the “black box” problem, which limits clinicians' ability to understand and trust the predictions these algorithms generate. Explainable AI models that provide insight into the decision-making process of ML algorithms are therefore needed to enhance their utility in clinical settings. Efforts should also be directed toward improving clinicians' understanding of ML models to facilitate their integration into routine practice [40]. Despite these challenges, ML has substantial potential to advance COVID-19 research and clinical care, and further exploration and refinement of ML-based approaches are warranted to improve patient outcomes.
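One common, model-agnostic way to look inside a "black box" is permutation importance: shuffle one feature at a time and measure how much the model's accuracy drops. The sketch below is an illustration of that general technique, not the method used in this study. The toy classifier, the two features, and all data values are hypothetical.

```python
import random
from typing import Callable, List

Row = List[float]


def accuracy(model: Callable[[Row], int], X: List[Row], y: List[int]) -> float:
    """Fraction of samples for which the model's label matches the true label."""
    return sum(model(row) == label for row, label in zip(X, y)) / len(y)


def permutation_importance(model: Callable[[Row], int], X: List[Row], y: List[int],
                           n_repeats: int = 20, seed: int = 0) -> List[float]:
    """Mean drop in accuracy when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = accuracy(model, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and the labels
            X_perm = [row[:j] + [value] + row[j + 1:]
                      for row, value in zip(X, column)]
            drops.append(baseline - accuracy(model, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances


def toy_model(row: Row) -> int:
    """Hypothetical classifier: predicts 'severe' (1) if feature 0 exceeds 1.0."""
    return 1 if row[0] > 1.0 else 0


# Invented data: feature 0 drives the label, feature 1 is ignored by the model.
X = [[2.0, 0.3], [1.8, 0.9], [2.5, 0.1], [1.6, 0.7],
     [0.2, 0.4], [0.5, 0.8], [0.1, 0.2], [0.7, 0.6]]
y = [1, 1, 1, 1, 0, 0, 0, 0]

scores = permutation_importance(toy_model, X, y)
print(scores)  # feature 1 scores exactly 0.0 because the model never reads it
```

A feature the model does not rely on keeps an importance near zero, which gives clinicians a simple, checkable statement about what drives a prediction.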
The comprehensive analysis of differential gene expression in varying severities of COVID-19 has shed light on the intricate molecular mechanisms underlying the disease's progression. We observed distinct gene profiles associated with asymptomatic, mild, moderate, and severe COVID-19, as well as coinfections. Notably, genes like ADAMTS2, PCSK9, and OLAH were consistently upregulated across all disease severities, aligning with existing research on their roles in inflammation, lipid metabolism, and immune regulation. These findings emphasize the significance of immune responses, particularly neutrophil-related processes, in the context of COVID-19 severity. Understanding the gene expression patterns associated with different disease severities not only contributes to our comprehension of COVID-19 pathogenesis but also holds promise for the development of targeted therapeutic interventions.
The conclusions drawn from our study provide a foundational understanding of the molecular mechanisms underlying COVID-19 severity and point to avenues for future research and clinical application. In particular, the distinct gene expression profiles associated with each disease severity could inform personalized treatment of severe COVID-19. By integrating transcriptomic data with clinical parameters, future work could refine prognostic models to predict disease progression and treatment response more accurately, helping healthcare professionals choose interventions based on the molecular signatures observed in individual patients. Strategies such as immunomodulation, anti-inflammatory therapy, or antiviral treatment could then be targeted to the molecular mechanisms identified at different severities, potentially improving clinical outcomes. Additionally, the integration of multi-omics data, including genomic, transcriptomic, and proteomic data, could provide a comprehensive understanding of the interplay between genetic factors and disease severity, paving the way for more precise and effective treatment. However, our study has limitations: larger datasets and more extensive clinical validation are needed to strengthen the generalizability of our findings.
The authors declare they have not used Artificial Intelligence (AI) tools in the creation of this article.
[1] | K. X. Sun, R. S. Zheng, H. M. Zeng, S. W. Zhang, X. N. Zou, X. Y. Gu, et al., The incidence and mortality of lung cancer in China, 2014, Zhonghua Zhong Liu Za Zhi, 40 (2018), 805-811. |
[2] | R. L. Siegel, K. D. Miller, A. Jemal, Cancer statistics, 2018, CA Cancer J. Clin., 68 (2018), 7-30. |
[3] | A. McIntyre, A. K. Ganti, Lung cancer-a global perspective, J. Surg. Oncol, 115 (2017), 550-554. |
[4] | A. Jemal, M. M. Center, C. DeSantis, E. M. Ward, Global patterns of cancer incidence and mortality rates and trends, Cancer Epidemiol. Biomarkers Prev., 19 (2010), 1893-1907. |
[5] | M. J. Duffy, Clinical uses of tumor markers: A critical review, Crit. Rev. Clin. Lab. Sci., 38 (2001), 225-262. |
[6] | P. Goldstraw, K. Chansky, J. Crowley, R. R. Porta, H. Asamura, W. E. Eberhardt, et al., The IASLC lung cancer staging project: Proposals for revision of the TNM stage groupings in the forthcoming (eighth) edition of the TNM classification for lung cancer, J. Thorac. Oncol, 11 (2016), 39-51. |
[7] | A. Matsuda, K. Katanoda, Five-year relative survival rate of lung cancer in the USA, Europe and Japan, Jpn J. Clin. Oncol, 43 (2013), 1287-1288. |
[8] | Y. Cai, X. Yu, S. Hu, J. Yu, A brief review on the mechanisms of miRNA regulation, Genomics, Proteomics Bioinf., 7 (2009), 147-154. |
[9] | E. Chan, D. E. Prado, J. B. Weidhaas, Cancer microRNAs: From subtype profiling to predictors of response to therapy, Trends Mol. Med., 17 (2011), 235-243. |
[10] | J. Ma, K. Mannoor, L. Gao, A. Tan, M. A. Guarnera, M. Zhan, et al., Characterization of microRNA transcriptome in lung cancer by next-generation deep sequencing, Mol. Oncol, 8 (2014), 1208-1219. |
[11] | A. E. Kerscher, F. J. Slack, Oncomirs-microRNAs with a role in cancer, Nat. Rev. Cancer, 6 (2006), 259-269. |
[12] | C. Sanfiorenzo, M. I. Ilie, A. Belaid, F. Barlesi, J. Mouroux, C. H. Marquette, et al., Two panels of plasma microRNAs as non-invasive biomarkers for prediction of recurrence in resectable NSCLC, PLoS One, 8 (2013), e54596. |
[13] | J. Y. Kwan, P. Psarianos, J. P. Bruce, K. W. Yip, F. F. Liu, The complexity of microRNAs in human cancer, J. Radiat. Res., 57 (2016), i106-i111. |
[14] | R. Tibshirani, Regression shrinkage and selection via the lasso, J. Roy. Stat. Soc. B, 58 (1996), 267-288. |
[15] | C. Chen, H. Chen, Y. He, R. Xia, TBtools, a toolkit for biologists integrating various HTS-data handling tools with a user-friendly interface, bioRxiv, (2018), 289660. |
[16] | H. Dweep, N. Gretz, miRWalk2.0: A comprehensive atlas of microRNA-target interactions, Nat. Methods, 12 (2015), 697. |
[17] | P. Shannon, A. Markiel, O. Ozier, N. S. Baliga, J. T. Wang, D. Ramage, et al., Cytoscape: A software environment for integrated models of biomolecular interaction networks, Genome Res., 13 (2003), 2498-2504. |
[18] | F. Yang, K. Wei, Z. Qin, W. Liu, C. Shao, C. Wang, et al., MiR-598 suppresses invasion and migration by negative regulation of derlin-1 and epithelial-mesenchymal transition in non-small cell lung cancer, Cell Physiol. Biochem., 47 (2018), 245-256. |
[19] | H. Y. Lee, S. S. Han, S. Y. Song, Serum microRNAs as potential biomarkers for lung cancer, Ann. Oncol., 27 (2016). |
[20] | X. H. Wang, Y. Lu, J. J. Liang, J. X. Cao, Y. Q. Jin, G. S. An, et al., MiR-509-3-5p causes aberrant mitosis and anti-proliferative effect by suppression of PLK1 in human lung cancer A549 cells, Biochem. Biophys. Res. Commun., 478 (2016), 676-682. |
[21] | Y. Z. Wang, J. M. Li, H. M. Chen, Y. Mo, H. Ye, Y. Luo, et al., Down-regulation of miR-133a as a poor prognosticator in non-small cell lung cancer, Gene, 591 (2016), 333-337. |
[22] | F. Bray, J. Ferlay, I. Soerjomataram, L. A. Torre, A. Jemal, Global cancer statistics 2018: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries, CA Cancer J. Clin., 68 (2018), 394-424. |
[23] | M. Cao, W. Chen, Epidemiology of lung cancer in China, Thorac. Cancer, 10 (2019), 3-7. |
[24] | F. R. Vamos, J. Tovari, J. Fillinger, J. Timar, S. Paku, I. Kenessey, et al., Lymphangiogenesis correlates with lymph node metastasis, prognosis, and angiogenic phenotype in human non-small cell lung cancer, Clin. Cancer Res., 11 (2005), 7344-7353. |
[25] | S. L. Yu, H. Y. Chen, G. C. Chang, et al., MicroRNA signature predicts survival and relapse in lung cancer, Cancer Cell, 13 (2008), 48-57. |
[26] | S. Y. Sathipati, S. Y. Ho, Identifying the miRNA signature associated with survival time in patients with lung adenocarcinoma using miRNA expression profiles, Sci. Rep., 7 (2017). |
[27] | L. Fang, J. Cai, B. Chen, S. Wu, R. Li, X. Xu, et al., Aberrantly expressed miR-582-3p maintains lung cancer stem cell-like traits by activating Wnt/β-catenin signalling, Nat. Commun., 6 (2015), 8640. |
[28] | J. Liu, S. Liu, X. Deng, J. Rao, K. Huang, G. Xu, et al., MicroRNA-582-5p suppresses non-small cell lung cancer cells growth and invasion via downregulating NOTCH1, PLoS One, 14 (2019), e0217652. |
[29] | L. L. Wang, M. Zhang, miR-582-5p is a potential prognostic marker in human non-small cell lung cancer and functions as a tumor suppressor by targeting MAP3K2, Eur. Rev. Med. Pharmacol. Sci., 22 (2018), 7760-7767. |
[30] | K. Skrzypek, M. Tertil, S. Golda, M. Ciesla, K. Weglarczyk, G. Collet, et al., Interplay between heme oxygenase-1 and miR-378 affects non-small cell lung carcinoma growth, vascularization, and metastasis, Antioxid. Redox Signaling, 19 (2013), 644-660. |
[31] | H. Y. Lee, S. S. Han, H. Rhee, J. H. Park, J. S. Lee, Y. M. Oh, et al., Differential expression of microRNAs and their target genes in non-small-cell lung cancer, Mol. Med. Rep., 11 (2015), 2034-2040. |
[32] | H. Y. Lee, S. S. Han, S. Y. Song, Serum microRNAs as potential biomarkers for lung cancer, Ann. Oncol., 27 (2016). |
[33] | X. H. Wang, Y. Lu, J. J. Liang, J. X. Cao, Y. Q. Jin, G. S. An, et al., MiR-509-3-5p causes aberrant mitosis and anti-proliferative effect by suppression of PLK1 in human lung cancer A549 cells, Biochem. Biophys. Res. Commun., 478 (2016), 676-682. |
[34] | F. Xing, S. Sharma, Y. Liu, Y. Y. Mo, K. Wu, Y. Y. Zhang, et al., miR-509 suppresses brain metastasis of breast cancer cells by modulating RhoC and TNF-α, Oncogene, 34 (2015), 4890. |
[35] | Q. Zhai, L. Zhou, C. Zhao, J. Wan, Z. Yu, X. Guo, et al., Identification of miR-508-3p and miR-509-3p that are associated with cell invasion and migration and involved in the apoptosis of renal cell carcinoma, Biochem. Biophys. Res. Commun., 419 (2012), 621-626. |
[36] | G. Korkmaz, C. Le Sage, K. A. Tekirdag, R. Agami, D. Gozuacik, miR-376b controls starvation and mTOR inhibition-related autophagy by targeting ATG4C and BECN1, Autophagy, 8 (2012), 165-176. |
[37] | D. Chen, W. Guo, Z. Qiu, Q. Wang, Y. Li, L. Liang, et al., MicroRNA-30d-5p inhibits tumour cell proliferation and motility by directly targeting CCNE2 in non-small cell lung cancer, Cancer Lett., 362 (2015), 208-217. |
[38] | Y. Li, P. Chen, L. Zu, B. Liu, M. Wang, Q. Zhou, MicroRNA-338-3p suppresses metastasis of lung cancer cells by targeting the EMT regulator Sox4, Am. J. Cancer Res., 6 (2016), 127-140. |
[39] | X. Chen, L. Wei, S. Zhao, miR-338 inhibits the metastasis of lung cancer by targeting integrin beta3, Oncol. Rep., 36 (2016), 1467-1474. |
[40] | P. Zhang, G. Shao, X. Lin, Y. Liu, Z. Yang, MiR-338-3p inhibits the growth and invasion of non-small cell lung cancer cells by targeting IRS2, Am. J. Cancer Res., 7 (2017), 53-63. |
[41] | J. Sui, S. Y. Xu, J. Han, S. R. Yang, C. Y. Li, L. H. Yin, et al., Integrated analysis of competing endogenous RNA network revealing lncRNAs as potential prognostic biomarkers in human lung squamous cell carcinoma, Oncotarget, 8 (2017), 65997-66018. |
[42] | X. Zhu, S. Ju, F. Yuan, G. Chen, Y. Shu, C. Li, et al., microRNA-664 enhances proliferation, migration and invasion of lung cancer cells, Exp. Ther. Med., 13 (2017), 3555-3562. |
[43] | F. Yang, K. Wei, Z. Qin, W. Liu, C. Shao, C. Wang, et al., MiR-598 suppresses invasion and migration by negative regulation of derlin-1 and epithelial-mesenchymal transition in non-small cell lung cancer, Cell Physiol. Biochem., 47 (2018), 245-256. |
[44] | X. Tong, P. Su, H. Yang, F. Chi, L. Shen, X. Feng, et al., MicroRNA-598 inhibits the proliferation and invasion of non-small cell lung cancer cells by directly targeting ZEB2, Exp. Ther. Med., 16 (2018), 5417-5423. |
[45] | L. Xu, B. Wei, H. Hui, Y. Sun, Y. Liu, X. Yu, et al., Positive feedback loop of lncRNA LINC01296/miR-598/Twist1 promotes non-small cell lung cancer tumorigenesis, J. Cell Physiol., 234 (2019), 4563-4571. |
[46] | J. Kim, N. J. Lim, S. G. Jang, H. K. Kim, G. K. Lee, miR-592 and miR-552 can distinguish between primary lung adenocarcinoma and colorectal cancer metastases in the lung, Anticancer Res., 34 (2014), 2297-2302. |
[47] | J. Cao, X. R. Yan, T. Liu, X. B. Han, J. J. Yu, S. H. Liu, et al., MicroRNA-552 promotes tumor cell proliferation and migration by directly targeting DACH1 via the Wnt/beta-catenin signaling pathway in colorectal cancer, Oncol. Lett., 14 (2017), 3795-3802. |
[48] | N. Wang, W. Liu, Increased expression of miR-552 acts as a potential predictor biomarker for poor prognosis of colorectal cancer, Eur. Rev. Med. Pharmacol. Sci., 22 (2018), 412-416. |
[49] | M. Xu, Y. Z. Wang, miR133a suppresses cell proliferation, migration and invasion in human lung cancer by targeting MMP14, Oncol. Rep., 30 (2013), 1398-1404. |
[50] | L. K. Wang, T. H. Hsiao, T. M. Hong, H. Y. Chen, S. H. Kao, W. L. Wang, et al., MicroRNA-133a suppresses multiple oncogenic membrane receptors and cell invasion in non-small cell lung carcinoma, PLoS One, 9 (2014), e96765. |
[51] | Y. Wang, J. Li, H. Chen, Y. Mo, H. Ye, Y. Luo, et al., Down-regulation of miR-133a as a poor prognosticator in non-small cell lung cancer, Gene, 591 (2016), 333-337. |
[52] | D. Schmitt, L. M. Da Silva, W. Zhang, Z. Liu, R. Arora, S. Lim, et al., ErbB2-intronic microRNA-4728: A novel tumor suppressor and antagonist of oncogenic MAPK signaling, Cell Death Dis., 6 (2015), e1742. |