Research article

Prediction of atherosclerosis using machine learning based on operations research


  • Background 

    Atherosclerosis is one of the major causes of cardiovascular disease, including coronary heart disease, cerebral infarction and peripheral vascular disease. Atherosclerosis has no obvious symptoms in its early stages, so the key to treatment is early intervention on risk factors. Machine learning methods have been used to predict atherosclerosis, but strong causal relationships between features can lead to extremely high levels of information redundancy, which can reduce the effectiveness of prediction systems.

    Objective 

    We aim to combine statistical analysis and machine learning methods to reduce information redundancy and further improve the accuracy of disease diagnosis.

    Methods 

    We cleaned and collated data obtained from a retrospective study at the Affiliated Hospital of Nanjing University of Chinese Medicine. First, features with too many missing values were filtered out of the 34 features, leaving 25. Under the guidance of relevant experts, 49% of the samples were categorized as the atherosclerosis risk group and the remaining 51% as the control group without atherosclerosis risk. We compared the prediction results of a single indicator that had been medically proven to be highly correlated with atherosclerosis with the prediction results of multiple features, to demonstrate the effect of feature information redundancy on the prediction results. Then, statistical tests were used to retain the features that could distinguish samples with atherosclerosis risk from those without, leaving 20 features. To reduce the information redundancy between features, and drawing inspiration from graph theory, machine learning combined with optimal correlation distances was used to select 15 significant features, and the prediction models were evaluated on these 15 features. Finally, the information in the 5 screened-out non-significant features was used through ensemble learning to further improve the prediction of atherosclerosis.
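
    The first and third screening steps described above (dropping sparse features, then reducing redundancy between the remaining ones) can be illustrated with a minimal sketch. It is not the authors' optimal correlation distance model, whose details are not given here: it swaps in a simple greedy filter that uses 1 - |Pearson r| as the distance between two features. The thresholds `max_missing` and `min_distance` and the feature names in the usage note are hypothetical.

```python
from math import sqrt


def missing_fraction(column):
    """Fraction of entries in a column that are missing (None)."""
    return sum(value is None for value in column) / len(column)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant column carries no correlation signal
    return sxy / sqrt(sxx * syy)


def correlation_distance(col_a, col_b):
    """1 - |r|, computed on the rows where both features are observed."""
    pairs = [(a, b) for a, b in zip(col_a, col_b)
             if a is not None and b is not None]
    if len(pairs) < 2:
        return 1.0  # too little overlap to call the features redundant
    xs, ys = zip(*pairs)
    return 1.0 - abs(pearson(xs, ys))


def screen_features(table, max_missing=0.3, min_distance=0.2):
    """Greedy screening; both thresholds are hypothetical defaults.

    table maps a feature name to its list of values (None = missing).
    """
    # Step 1: drop features with too many missing values.
    dense = {name: col for name, col in table.items()
             if missing_fraction(col) <= max_missing}
    # Step 2: keep a feature only if it is far enough from all kept ones.
    kept = []
    for name, col in dense.items():
        if all(correlation_distance(col, dense[other]) >= min_distance
               for other in kept):
            kept.append(name)
    return kept
```

    With a toy table such as `{"ldl": [1, 2, 3, 4, 5], "ldl_copy": [2, 4, 6, 8, 10.5], "age": [5, 1, 4, None, 3], "sparse": [None, None, None, 1, 2]}`, the sketch drops `sparse` for missingness and `ldl_copy` as a near duplicate of `ldl`. A greedy filter depends on column order, which is one reason an optimized selection such as the one the abstract describes may behave differently.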

    Results 

    The area under the receiver operating characteristic (ROC) curve (AUC), which measures the predictive performance of the model, was 0.84035 and the Kolmogorov-Smirnov (KS) value was 0.646. After applying the feature selection model based on optimal correlation distance, the AUC was 0.88268 and the KS value was 0.688, both improved by about 0.04. Finally, after ensemble learning, the AUC of the model improved by a further 0.01369 to 0.89637.
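
    For readers unfamiliar with the two metrics reported above, the following minimal sketch shows one common way to compute them from binary labels and predicted scores: the AUC by trapezoidal integration of the ROC curve, and the KS value as the largest gap between the true positive rate and the false positive rate along that curve. The function names and the toy data are illustrative assumptions, not taken from the authors' repository.

```python
def roc_points(labels, scores):
    """ROC curve as (FPR, TPR) points, sweeping the threshold downward.

    labels are 0/1 and must contain at least one of each class.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume all samples tied at this score before emitting a point.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


def ks_statistic(points):
    """Largest vertical gap between TPR and FPR along the curve."""
    return max(abs(tpr - fpr) for fpr, tpr in points)


# Toy example: three positives and three negatives.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
curve = roc_points(labels, scores)
print(round(auc(curve), 4), round(ks_statistic(curve), 4))
```

    In this toy case eight of the nine positive-negative pairs are ranked correctly, so the AUC is 8/9, and the KS value is 2/3.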

    Conclusions 

    The optimal distance feature screening model proposed in this paper improves the performance of atherosclerosis prediction models in terms of both prediction accuracy and AUC metrics. Code and models are available at https://github.com/Cesartwothousands/Prediction-of-Atherosclerosis.

    Citation: Zihan Chen, Minhui Yang, Yuhang Wen, Songyan Jiang, Wenjun Liu, Hui Huang. Prediction of atherosclerosis using machine learning based on operations research[J]. Mathematical Biosciences and Engineering, 2022, 19(5): 4892-4910. doi: 10.3934/mbe.2022229




    1. Introduction

    Research in the field of epigenetics has taken off in the last decade, as evidenced by the growing volume of published literature and the number of scientific meetings. This is largely due to numerous findings of its critical role in diseases such as cancer, in development and in responses to environmental cues in a wide range of species. Epigenetics means "in addition to" or "above" genetics, implying changes in gene expression that do not alter the DNA sequence. These changes are inherited from cell to cell and trans-generationally from parent to offspring. They involve chemical modifications of the DNA such as methylation, histone post-translational modifications leading to chromatin modifications, remodeling and attachment to the nuclear matrix, packaging of DNA around nucleosomes and RNA-mediated gene silencing. Epigenetic modifications are usually influenced by environmental cues, including diet, physical stresses such as temperature, or chemicals such as toxins, and can also be stochastic. A striking example is seen in Agouti mice exposed to bisphenol A, a ubiquitous chemical found in our environment. These mice are genetically identical twins but differ in size and fur color. In slim, healthy brown mice, transcription of the Agouti gene is prevented by DNA methylation, while in yellow obese mice, which are prone to diabetes and cancer, the same gene is not methylated, resulting in its expression [1,2]. This is a fine example of the trans-generational inheritance of an epigenetic state, where the Agouti locus escaped the usual resetting of epigenetic states during reproduction.

    In the fruit fly Drosophila melanogaster, temperature treatment changes the eye color from white to red, and the treated individual flies pass on the change to their offspring over several generations without further requirement of temperature treatment [3]. The DNA sequence of the gene responsible for eye color remained the same for white eyed parents and red eyed offspring and the change was attributed to a specific histone modification [3]. Consistent with the work described above, a more recent study in Drosophila showed that the fission yeast homolog of activation transcription factor 2 (ATF2) that usually contributes to heterochromatin formation becomes phosphorylated leading to its release from heterochromatin upon heat shock or osmotic stress [4]. This new heterochromatin state that does not involve any DNA sequence change is transmitted over multiple generations [4].

    In an ecological context, variation of DNA methylation was observed in a wild population of Viola cazorlensis which is a perennial plant [5]. Using a modeling approach on data collected over many years, the authors have observed that epigenetic variation is significantly correlated with long-term differences in herbivory, but only weakly with herbivory-related DNA sequence variation suggesting that besides habitat, substrate and genetic variation, epigenetic variation may be an additional, and at least partly independent, factor influencing plant-herbivore interactions in the field [5].

    The above-discussed examples show a remarkable conservation of the function of epigenetic mechanisms in regulating gene expression among mammals, plants and invertebrates. This conservation goes beyond these species including early diverging single celled organisms such as microalgae. In this work, we will discuss how the study of non-model or emerging model organisms such as diatoms helps understand the evolutionary history of epigenetic mechanisms with a particular focus on DNA methylation and histone modifications.

    2. Diatoms, what are they?

    Diatoms are photosynthetic eukaryotic algae with cell sizes that usually range between 10 and 200 μm. They are found in all aquatic habitats including fresh and marine waters. These single celled species belong to Stramenopiles, which are part of the supergroup, Chromalveolates, containing also the Alveolata, the Haptophyta and Cryptophyceae (Figure 1, [6,7]). Diatoms are one of the most diverse and widespread phytoplankton with more than 100,000 extant species which are divided into two orders: centric that are round with radial symmetry and pennate that are elongate with bilateral symmetry (Figure 2). Fossil evidence suggests that diatoms originated during or before the early Jurassic period (~ 210-144 Mya). They are hypothesized to be derived from successive endosymbiosis where a heterotrophic eukaryotic host engulfed cells, phylogenetically close to red and green alga [8], combining therefore features from both green and red algae predecessors [9]. The diversity of diatoms increased further via the horizontal transfer of bacterial genes [10]. Diatoms and bacteria have indeed co-occurred in common habitats throughout the oceans for more than 200 million years, fostering interactions between these two diverse groups over evolutionary time scales [11]. Diatoms are at the base of the food web contributing to one fifth of the planet’s oxygen and representing 40% of primary marine productivity [12]. They therefore play a critical role sustaining life not only in the oceans but also on Earth as a whole through their role in the global carbon cycle. Diatoms are also important for human society, providing food through the aquatic food chain and high value compounds for cosmetic, pharmaceutical and industrial applications.

    Figure 1. Eukaryote phylogenetic tree. The tree is derived from different molecular phylogenetic and ultrastructural studies (adapted from [13]). Images courtesy of NCMA, the Culture Collection of Marine Phytoplankton at Bigelow Laboratory for Ocean Sciences, and for dinoflagellates (image courtesy of Richard Dorrell). Red arrow head points to diatoms.
    Figure 2. Light microscopy micrographs of representative model diatoms. A. The pennate diatom Phaeodactylum tricornutum. Scale bar 5 μm. B. The centric diatom Thalassiosira pseudonana. Scale bar 2 μm.

    Several diatom genome sequences are now available, including those of the two centrics Thalassiosira pseudonana (32 Mbp; http://genome.jgi-psf.org/Thaps3/Thaps3.home.html) [14] and Thalassiosira oceanica (81.6 Mbp) [15], the pennate Phaeodactylum tricornutum (27 Mbp; http://genome.jgi-psf.org/Phatr2/Phatr2.home.html) [10] (Figure 2), the polar, cold-loving species Fragilariopsis cylindrus (80 Mbp; http://genome.jgi-psf.org/Fracy1/Fracy1.home.html), the toxigenic coastal species Pseudo-nitzschia multiseries (300 Mbp; by the Joint Genome Institute) and the high lipid content diatom Fistulifera sp. strain JPCC DA058 [16]. The ecological success of diatoms suggests that they have developed sophisticated ways to cope with changing environments. Complete sequencing of the P. tricornutum genome [17] showed that it has an unusual genetic composition, which arose through successive endosymbioses and horizontal gene transfers from bacteria. These events have provided diatoms with several unusual metabolic pathways, such as the urea cycle, which was previously considered to exist only in animals [14,18]. The ability of diatoms to survive in rapidly changing environments with fluctuating conditions (UV radiation, temperature, salinity, toxins, nutrients, grazing pressure, etc.) is also attributable to another layer of regulation known as epigenetics [19]. It was previously shown using McrBC, an enzyme sensitive to methylated DNA, that an LTR retrotransposon called Blackbeard (Bkb) is induced under nitrate limitation, with a concomitant decrease in cytosine methylation, suggesting that nitrate depletion induces demethylation and upregulation of Bkb [20]. Although de novo insertion of Bkb was not shown in this study, its distribution, together with that of two other retrotransposons, was analyzed in thirteen different accessions of P. tricornutum. The work showed clear differences in the distribution of the three retrotransposons among the tested accessions, demonstrating their transposition in natural environments [20]. Besides this experimental clue to the occurrence of DNA methylation, our recent work [21,22,23,24] revealed a remarkable conservation of the epigenetic machinery in this model diatom. P. tricornutum possesses histone-modifying enzymes and small RNAs [25,26], as well as DNA methylation, which is absent in the multicellular brown alga Ectocarpus siliculosus, another member of the Stramenopiles [27].

    3. DNA methylation

    Cytosine DNA methylation is so far the best characterized epigenetic mark. It is a biochemical process, common to all three superkingdoms, in which a methyl group is added to position five of the cytosine pyrimidine ring (5meC). Cytosine methylation is a conserved epigenetic mechanism crucial for a number of developmental processes such as regulation of imprinted genes, X-chromosome inactivation, silencing of repetitive elements including viral DNA and transposons, and regulation of gene expression [28,29]. DNA methylation is widespread among protists, plants, fungi and animals [30,31]. It is, however, absent or very low in some species such as the budding yeast Saccharomyces cerevisiae, the fruit fly Drosophila melanogaster, the nematode worm Caenorhabditis elegans and the brown alga Ectocarpus siliculosus [27,32].

    With the advent of sequencing technologies and their increasing resolution and depth, our understanding of DNA methylation in the main supergroups of eukaryotes, plants and animals is starting to emerge. The recently published methylome of P. tricornutum [23], which is phylogenetically distant from classic model organisms in the animal and green plant groups as well as from diverse protists [31,33], drew a clearer picture and brought more insight into the evolutionary history of DNA methylation. With a genome size of 27 Mb, P. tricornutum shows a low level of DNA methylation compared to other eukaryotes such as human, Arabidopsis and the sea squirt Ciona intestinalis [31,33,34] (Figure 3). This is not correlated with genome size, as evidenced by the higher methylation level of Ostreococcus [33], which has a much smaller genome, and the low methylation in honey bee [31], whose genome is nearly ten times bigger than that of P. tricornutum. Although few species are compared in Figure 3, the increase in cytosine DNA methylation seems to correlate with the average content of transposable elements, which are presumably kept silenced, and with the complexity of the genome. Comparative epigenomics or methylomics provides some insight into the genes that might have influenced the evolutionary fate of species. A striking example is the differentially methylated genic regions (DMRs) found between human and closely related primates such as chimpanzees, gorillas and orangutans, which encode neurological functions, suggesting that species divergence correlated with developmental specialization [35,36]. In line with these observations, comparative epigenetic analysis of the two diatoms, the pennate P. tricornutum and the centric T. pseudonana [33], revealed no major differences in the fraction of the genome that is methylated or in the methylation context (Figure 4). However, out of 6199 shared genes, 408 are methylated only in P. tricornutum versus 461 only in T. pseudonana. DMRs between the two species are subsequently reflected in the enrichment of different GO categories [33] (Figure S1). Investigating these genes further might shed light on the history of their evolutionary divergence.

    Figure 3. DNA methylation in diverse Eukaryotes. Graphical representation of genome-wide percentages of cytosine DNA methylation as well as in different contexts [C (red), CG (green), CHG (orange) and CHH (black)]. Species names are represented on the Y-axis. All the stated elements are represented as stacks over a gray bar indicating the size of each genome in megabase pairs (Mbp). For comparison, the human genome methylation data are given: genome size (3381,94), methylated Cs (75%). Data were taken from [33,37,38], http://genome.jgi-psf.org/ and http://phytozome.jgi.doe.gov/pz/portal.html.
    Figure 4. The orthologous gene body cytosine methylation analysis. The genes that are differentially methylated between Phaeodactylum tricornutum (Pt) and Thalassiosira pseudonana (Tp) are represented. Qualitative analysis of gene body cytosine methylation on the orthologous genes between Pt and Tp genome. Using reciprocal best-hit BLAST approach, orthologous genes between Pt and Tp genomes are found. Out of 6199 orthologues, 459 genes are methylated in Pt whereas 512 genes are found methylated in Tp genome. The Venn comparison of these genes shows the conservation of gene body cytosine methylation over 51 genes while 408 and 461 genes are specifically methylated in Pt and Tp genomes, respectively. SRA accessions: Tp = GSM1134628; Pt = GSM1134626.
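
    The reciprocal best-hit step and the Venn comparison described in the Figure 4 caption can be sketched as follows. This is a minimal illustration, assuming the best BLAST hit of each gene in the other genome has already been extracted into two dictionaries; the gene identifiers and helper names are hypothetical and do not come from the cited analysis.

```python
def reciprocal_best_hits(best_a_to_b, best_b_to_a):
    """Ortholog pairs (a, b): b is a's best hit and a is b's best hit."""
    return {(a, b) for a, b in best_a_to_b.items()
            if best_b_to_a.get(b) == a}


def compare_methylation(orthologs, methylated_a, methylated_b):
    """Count ortholog pairs methylated in both genomes, only A, only B."""
    both = only_a = only_b = 0
    for a, b in orthologs:
        in_a = a in methylated_a
        in_b = b in methylated_b
        if in_a and in_b:
            both += 1
        elif in_a:
            only_a += 1
        elif in_b:
            only_b += 1
    return both, only_a, only_b


# Toy best-hit tables (hypothetical identifiers).
best_pt_to_tp = {"Pt1": "Tp1", "Pt2": "Tp2", "Pt3": "Tp9", "Pt4": "Tp4"}
best_tp_to_pt = {"Tp1": "Pt1", "Tp2": "Pt2", "Tp9": "Pt7", "Tp4": "Pt4"}
orthologs = reciprocal_best_hits(best_pt_to_tp, best_tp_to_pt)

# Pt3/Tp9 is dropped because the hit is not reciprocal.
print(sorted(orthologs))
print(compare_methylation(orthologs, {"Pt1", "Pt2"}, {"Tp1", "Tp4"}))
```

    The counts reported in the caption are internally consistent with this kind of partition: 51 shared plus 408 Pt-specific genes gives 459 methylated genes in Pt, and 51 shared plus 461 Tp-specific genes gives 512 in Tp.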

    DNA methylation can occur in different contexts including CG, CHG and CHH, where H can be any nucleotide except G. In P. tricornutum, DNA methylation was found in all contexts, suggesting that CHG and CHH methylation is not a plant innovation but already existed in a common ancestor and was lost from certain lineages. Indeed, eukaryotes have evolved and/or retained different DNA methyltransferase complements responsible for the different contexts of methylation. Metazoans commonly encode DNMT1 and DNMT3 proteins, while higher plants additionally have the plant-specific chromomethylase (CMT). Fungi, on the other hand, have DNMT1, Dim-2, DNMT4 and DNMT5 [39,40]. Previous phylogenetic analysis suggests that the P. tricornutum genome encodes a peculiar set of DNMTs compared to other eukaryotes [41]. DNMT1 appears to be absent in P. tricornutum, as do putative proteins corresponding to the plant-specific DNA methyltransferases CMT3 and DRM, which are responsible for non-CG methylation. P. tricornutum encodes DNMT2 (Pt16674), which is an RNA methyltransferase that shows strong sequence similarity to DNA cytosine C5 methyltransferases. In addition to DNMT3 (Pt46156), diatom genomes also encode DNMT5 (Pt45072) and DNMT6 (Pt36049) proteins as well as a bacterial-like DNMT (Pt47357) [41]. In bacteria, cytosine methylation acts in the restriction-modification system; the function of a bacterial-like DNMT in P. tricornutum is therefore unclear. Interestingly, it is conserved in the centric diatom T. pseudonana (Tp2094), from which pennate diatoms such as P. tricornutum diverged ~90 million years ago. This implies that a diatom common ancestor acquired this DNMT from bacteria by horizontal gene transfer prior to the centric/pennate diatom split [42]. Conservation of this gene in diatoms over this length of time suggests that it is functional. Because DNMT5 is also found in other algae and fungi, we postulate that it was present in a common ancestor. Furthermore, structural, functional and phylogenetic data suggest that CMT, Dim-2 and DNMT1 are monophyletic [39,40]. Therefore, we propose that the common ancestor of plants, unikonts and stramenopiles possessed DNMT1 (subsequently lost in diatoms), DNMT3, and probably also DNMT5 (lost in metazoans and higher plants). This evolutionarily important loss is supported by the absence of DNA methyltransferases in the stramenopile E. siliculosus [27]. P. tricornutum encodes three putative DNA demethylases (Pt46865, Pt48620, Pt12645) with an ENDO domain similar to that of the Arabidopsis DNA demethylase ROS1, suggesting similar mechanisms for DNA demethylation.
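
    The three sequence contexts introduced at the start of this discussion (CG, CHG and CHH, with H standing for A, C or T) can be made concrete with a short sketch that classifies each cytosine by the two bases that follow it. It is a simplified illustration that scans the forward strand only and skips cytosines whose context is incomplete or ambiguous; the function names are hypothetical.

```python
H_BASES = set("ACT")  # H = any nucleotide except G


def cytosine_context(seq, i):
    """Context of the cytosine at index i: 'CG', 'CHG', 'CHH' or None.

    None is returned if seq[i] is not C, or if the downstream bases
    are missing (sequence end) or ambiguous (e.g. N).
    """
    seq = seq.upper()
    if seq[i] != "C":
        return None
    downstream = seq[i + 1:i + 3]
    if downstream[:1] == "G":
        return "CG"
    if len(downstream) < 2 or downstream[0] not in H_BASES:
        return None
    if downstream[1] == "G":
        return "CHG"
    if downstream[1] in H_BASES:
        return "CHH"
    return None


def count_contexts(seq):
    """Number of forward-strand cytosines in each context."""
    counts = {"CG": 0, "CHG": 0, "CHH": 0}
    for i in range(len(seq)):
        context = cytosine_context(seq, i)
        if context is not None:
            counts[context] += 1
    return counts


# C at 0 is followed by G (CG), C at 2 by AG (CHG), C at 5 by TT (CHH);
# the final C has no downstream bases and is not counted.
print(count_contexts("CGCAGCTTC"))
```

    A genome-wide analysis would also scan the reverse complement, since methylation calls are made on both strands.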

    Dnmt5 was reported in a wide range of single-celled eukaryotic species that lack Dnmt1 but nevertheless retain CG methylation, which was shown to be catalyzed by Dnmt5 [33]. In that work, the authors used Cryptococcus neoformans, which has Dnmt5 as its sole DNA methyltransferase, and showed that CG methylation is entirely lost when DNMT5 is deleted [33]. However, the authors did not exclude the possibility that another, unknown methyltransferase catalyzes CG methylation and uses Dnmt5 as a required accessory or regulatory protein [33]. As mentioned above, a typical Dnmt1 does not exist in P. tricornutum, but our in silico analysis revealed the presence of a gene that seems to encode a Dnmt1 remnant protein, which lacks the C5 methyltransferase catalytic domain but has retained two motifs characteristic of Dnmt1: the bromo-adjacent homology (BAH) domain and a cysteine-rich region (ZF_CXXX) that binds zinc ions. In higher eukaryotes, Dnmt1 is the enzyme that catalyzes CG methylation, and the activity of its catalytic domain is regulated by the N-terminal region of the protein. Indeed, an isolated Dnmt1 catalytic domain was shown to be inactive [43,44]. Interestingly, both the BAH and cysteine-rich domains are found within the N-terminal region of Dnmt1 in higher eukaryotes. A tempting hypothesis is that the P. tricornutum Dnmt1-like protein is the accessory protein that interacts with Dnmt5 to catalyze CG methylation. It is also tempting to think that these two domains, which exist as independent proteins in P. tricornutum, fused over evolutionary time into a single polypeptide in higher eukaryotes and gave rise to the eukaryotic Dnmt1. We are currently using a reverse genetic approach to determine the function of Dnmts and the putative accessory protein in P. tricornutum. This work will help to better understand their role in processes such as maintenance and de novo DNA methylation, as well as context specificities, which will ultimately shed light on the function of DNMTs in an evolutionary context.

    The P. tricornutum methylome, discussed in various studies [23,30,31], confirms that gene body methylation is an ancient, conserved feature, with a preference for exons over introns in all eukaryotic genomes where it has been examined, including Arabidopsis, Ciona intestinalis, honey bee and human. Several hypotheses have been proposed to explain this specific pattern and, interestingly, in silico analysis of the P. tricornutum genome revealed some evidence that supports them. P. tricornutum encodes ROS1-related glycosylases, which were thought to be present only in Arabidopsis, where they were shown to specifically remove DNA methylation from gene ends [45]. A more universal factor that might explain the gene body methylation pattern is the histone mark H3K4me, which antagonizes DNA methylation and is distributed around the transcription start site in the genomes where it has been examined. In P. tricornutum, H3K4me2 does not colocalize with DNA methylation and maps around the translation start site [24], which is in line with its potential contribution to the DNA methylation pattern at gene bodies.

    A conserved function for gene-body methylation at the whole-genome level has not yet been established. When examined, sets of body-methylated genes were found to be expressed constitutively at moderate levels such as in angiosperms and most invertebrates [34,46,47,48]. Nevertheless, in the silkworm, gene-body methylation correlates positively with gene expression levels [49]. In human, gene body methylation was shown to be involved in X chromosome activation [50] while it was recently reported that methylation of the first exon of autosomal genes correlates with transcriptional silencing [51]. It was also proposed that gene body methylation in human regulates the activity of intragenic alternative promoters [52]. In this line, a recent study [53] has established that body-methylated genes in A. thaliana are functionally more important, as measured by phenotypic effects of insertional mutants, than unmethylated genes. Using a probabilistic approach, the authors have reanalyzed single-base resolution bisulfite sequence data from A. thaliana. They demonstrated that body methylated genes are likely involved in either suppressing expression from cryptic promoters within coding regions and/or in enhancing accurate splicing of primary transcripts [53]. Interestingly, these functions were already proposed by previous studies [54,55,56], and the recent comparative study of honey-bee methylome has also established a link between gene-body methylation and splicing [57]. In our study, we found that gene-body methylation in P. tricornutum correlates positively with gene length and exon number. It is thus tempting to infer that intragenic methylation in P. tricornutum may play a role in avoiding aberrant transcription and/or mis-splicing. 
Furthermore, functional annotation of body-methylated genes reveals the presence of important functional classes such as (1) transferases and catalytic enzymes that play important role in cell wall assembly and its rearrangement which is crucial for cell integrity, (2) hydrolase activity which is important in stress responses, and (3) transporter activity necessary for metabolites shuttling such as silicic acid. Considering previous studies and in light of our recent work in P. tricornutum, gene body methylation does not suppress expression but rather correlates with low to moderate transcriptional activity. This might have the putative function of preventing aberrant transcription from intragenic promoters and appears to be a common and ancestral eukaryotic feature as reported previously [31,54].

    4. Histones and their modifications

    Eukaryotic chromosomes are packaged in the nucleus by wrapping the DNA around an octamer of the four core histone proteins H2A, H2B, H3 and H4, forming the basic unit of chromatin, the nucleosome. Further compaction is achieved by the interaction of the nucleosome with the linker histone H1. This organization seems to be conserved among all eukaryotes and even in archaea, where the nucleosomes are formed of only a tetramer of two H3 and H4 histones and reside in the cell itself, as archaea do not have a nucleus. Furthermore, nucleosome occupancy was found to be similar in two species of archaea, with depletion over transcriptional start sites as well as conservation of the nucleosome positioning code [58,59]. These similarities between eukaryotic and archaeal chromatin suggest that histones and chromatin architecture evolved before the divergence of Archaea and Eukarya. They also suggest that the initial function of nucleosomes and chromatin formation might have been the regulation of gene expression rather than the packaging of DNA, which is a eukaryotic invention [58].

    Histones are subject to a variety of post-translational modifications (PTMs) that have an important role in several processes such as transcription, replication and DNA repair. Histone PTMs in particular at the N terminus include acetylation, methylation, phosphorylation and ubiquitination, which were extensively studied in diverse species, along with modifications like sumoylation, glycosylation, biotinylation, carbonylation, and ADP ribosylation for which little is known [60]. Histone PTMs function either by altering the accessibility of genes to the transcriptional machinery, or by binding to effector proteins via specialized chromatin domains that deposit or erase these histone modifications. PTMs function in a combinatorial pattern known as the histone code, which confers active or repressive chromatin states to specific chromosomal regions of the genome [60,61].

    P. tricornutum possesses 14 histone genes encoding 9 histone proteins. They are dispersed throughout five chromosomes with most in clusters of two to six genes as seen for most Eukaryotes. P. tricornutum histones belong to the five known classes, histone H1, H3, H4, H2A and H2B. These histones are conserved among diatoms and eukaryotic species. With the exception of histones H4 and H2B, P. tricornutum encodes variants for each histone H1, H3 and H2A. Sequence alignment of histone H3 shows the presence of canonical and replacement histones similar to human, H3.2 and H3.3. Additionally, P. tricornutum expresses a centromere specific variant commonly called CenH3 that varies considerably from the rest of H3 histones especially in the N terminal tail. CenH3 is essential for recruitment of kinetochores components ensuring correct segregation of chromosomes during mitosis and meiosis [62].

    H2A histones constitute the most diverse group of histones, with the greatest number of variants. P. tricornutum is no exception: it encodes two copies of the canonical H2A as well as the H2AZ (Pt28445) and H2AX variants, the latter of which is missing from C. elegans and from protozoan parasites such as Plasmodium and Trypanosomes. The presence of the conserved motif SQE/D at the C terminus of P. tricornutum H2AX suggests a putative role of this histone in the maintenance of genome integrity through its contribution to the repair of double-stranded DNA breaks. P. tricornutum encodes two histone H1 variants, which share nearly 50% identity. Interestingly, one of them (Pt44318) is expressed only under stress conditions such as high light, which suggests a putative role in DNA repair, as found previously in yeast and vertebrates [63,64]. The diversity of histone variants in P. tricornutum is interesting and suggests an adaptive evolution to the life history of diatoms, in which the chromatin interface helps them acquire new abilities to cope with a changing environment.
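
    The motif-based reasoning above can be illustrated with a minimal sketch that checks whether an SQ(E/D) motif lies near the C terminus of a protein sequence. The function name, the 10-residue window and the two toy sequences are assumptions made for illustration only; they are not taken from the P. tricornutum annotation.

```python
import re

# SQ followed by E or D, written with one-letter amino acid codes.
SQ_MOTIF = re.compile(r"SQ[ED]")


def has_c_terminal_sq_motif(protein: str, window: int = 10) -> bool:
    """Return True if an SQ(E/D) motif occurs in the last `window` residues.

    `window` is an arbitrary illustrative choice, not a published cutoff.
    """
    if window <= 0:
        raise ValueError("window must be a positive number of residues")
    tail = protein.upper()[-window:]
    return SQ_MOTIF.search(tail) is not None


# Made-up toy sequences, not real histone proteins.
toy_h2ax_like = "MSGRGKGGKAKAKTPSSQEY"  # carries SQE close to the C terminus
toy_h2a_like = "MSGRGKGGKAKAKTPKKAAA"   # no SQ(E/D) motif in the tail

for name, seq in [("H2AX-like", toy_h2ax_like), ("H2A-like", toy_h2a_like)]:
    print(name, has_c_terminal_sq_motif(seq))
```

    A motif match of this kind is only a first hint of H2AX-like function; it does not replace alignment against characterized H2AX proteins.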

    Genome sequencing of P. tricornutum and T. pseudonana revealed a long list of histone modifying and demodifying enzymes, which are summarized in Table 1. This shows the strong conservation of the writers and erasers of histone modification marks in diatoms and their ancient origin. Furthermore, mass spectrometry (MS) analysis of PTMs in P. tricornutum showed similarities to plants and mammals, including acetylation and/or methylation of several lysines on the N-terminal tails of histones H2A, H2B, H3 and H4 and mono-, di- and trimethylation of lysines 4, 9, 27 and 36 of histone H3, suggesting the early divergence of these PTMs and their important role in the transcriptional regulation of many biological processes (Table 2). Interestingly, P. tricornutum combines histone PTMs found in mammals and in plants. One example is the acetylation and mono- and dimethylation of lysine 79 of histone H3, found in human and yeast [65] but not in Arabidopsis [66], underlining P. tricornutum genome diversity and the divergence of histone modifications among species throughout evolution. Another interesting example is the acetylation of lysine 20 of histone H4, which is shared with Arabidopsis but differs from human, where the residue is only methylated [66]. H4K20me, which is known to be a repressive mark, was detected neither by mass spectrometry nor by western blot using an antibody that recognizes this modification in Arabidopsis (data not shown). Furthermore, mono- and dimethylation of lysine 79 of histone H4 are modifications that P. tricornutum shares only with Toxoplasma gondii, an obligate intracellular parasitic protozoan belonging to the Alveolates, a superphylum closely related to Stramenopiles [24]. A non-exhaustive mass spectrometry analysis of histones from an early diverging diatom, Thalassiosira pseudonana, shows the presence of similar histone PTMs (Figure 5), which points to the important role that histone PTMs might have had in shaping diatom genomes and ultimately in the diversification of eukaryotes.

    Table 1. Histone-modifying enzymes in two diatom species. Putative enzymes responsible for histone modification identified in P. tricornutum and T. pseudonana. New gene models are given for P. tricornutum (http://protists.ensembl.org/Phaeodactylum_tricornutum/Location/Genome).
    Histone Modifiers | Residues Modified | Homologs in P. tricornutum (Phatr2) | Homologs in P. tricornutum (Phatr3) | Homologs in T. pseudonana
    Lysine Acetyltransferases (KATs)
    HAT1 (KAT1) | H4 (K5, K12) | 54343 | Phatr3_J54343 | 1397, 22580
    GCN5 (KAT2) | H3 (K9, K14, K18, K23, K36) | 46915 | Phatr3_J2957 | 15161
    Nejire (KAT3); CBP/p300 (KAT3A/B) | H3 (K14, K18, K56) H4 (K5, K8); H2A (K5) H2B (K12, K15) | 45703, 45764, 54505 | Phatr3_J45703, Phatr3_J45764, Phatr3_J54505 | 24331, 269496, 263785
    MYST1 (KAT8) | H4 (K16) | 24733, 24393 | Phatr3_J51406, Phatr3_J3062 | 37928, 36275
    ELP3 (KAT9) | H3 | 50848 | Phatr3_J50848 | 9040
    Unknown
    RPD3 (Class I HDACs) | H2, H3, H4 | 51026, 49800 | Phatr3_J51026, Phatr3_J49800 | 41025, 32098, 261393
    HDA1 (Class II HDACs) | H2, H3, H4 | 45906, 50482, 35869 | Phatr3_J45906, Phatr3_J50482, Phatr3_J35869 | 268655, 269060, 3235, 15819
    NAD+ dependent (Class III HDACs) | H4 (K16) | 52135, 45850, 24866, 45909, 52718, 21543, 39523 | Phatr3_J52135, Phatr3_J45850, Phatr3_J8827, Phatr3_J12305, Phatr3_J16589, Phatr3_J21543, Phatr3_J39523 | 269475, 264809, 16405, 35693, 264494, 16384, 35956
    Lysine Methyltransferases
    MLL | H3 (K4) | 40183, 54436, 42693, 47328, 49473, 49476, 44935 | Phatr3_EG00277, Phatr3_EG02316, Phatr3_J6915, Phatr3_J47328, Phatr3_EG00277, Phatr3_15913, Phatr3_J44935 | 35182, 35531, 22757
    ASH1/WHSC1 | H3 (K4) | 43275 | Phatr3_6093 | 264323
    SETD1 | H3 (K36), H4 (K20) | not found | not found | not found
    SETD2 | H3 (K36) | 50375 | Phatr3_EG02211 | 35510
    SETDB1 | H3 (K9) | not found | not found | not found
    SETMAR | H3 (K4, K36) | not found | not found | not found
    SMYD | H3 (K4) | bd1647, 43708 | Phatr3_J1647, Phatr3_J43708 | 23831, 24988
    TRX-related | | not found | not found | not found
    E(Z) | H3 (K9, K27) | 32817 | Phatr3_J6698 | 268872
    EHMT2 | H3 (K9, K27) | not found | not found | not found
    SET+JmjC | Unknown | bd1647 | Phatr3_J1647 | not found
    Lysine Demethylases (KDM)
    LSD1 (KDM1) | H3 (K4, K9) | 51708, 44106, 48603 | Phatr3_J51708, Phatr3_J44106, Phatr3_J48603 | not found
    FBXL (KDM2) | H3 (K36) | 42595 | Phatr3_J42595 | not found
    JMJD2 (KDM4)/JARID | H3 (K9, K36) | 48747 | Phatr3_J48747 | 2137
    JMJ-MBT | Unknown | 48109 | Phatr3_J48109 | 22122
    JMJ-CHROMO | Unknown | 40322 | Phatr3_J40322 | 1863
    Table 2. Diversity of histone PTMs in P. tricornutum. Examples of histone PTMs present in P. tricornutum but absent or not detected (ND) in representatives of two major lineages, animals and plants. Data taken from [24,66,67].
    Histone PTM | P. tricornutum | H. sapiens | A. thaliana
    H4K31 | present | ND | ND
    H4K59Ac | present | ND | ND
    H4K59me | present | ND | ND
    H4K79me | present | ND | ND
    H4K79me2 | present | ND | ND
    H4K20Ac | present | ND | present
    H4K20me | present | present | ND
    H3K79me | present | present | ND
    H3K79me2 | present | present | ND
    H2BK107Ac | present | ND | ND
    Figure 5. Histone PTMs in T. pseudonana. Diagram showing sites of PTMs of core and variant histones identified in Thalassiosira pseudonana by mass spectrometry. Amino acid residue number is indicated below the peptide sequence. Dark gray, black and light gray boxes indicate N-terminal, globular core and C-terminal domains, respectively. Acetylation and methylation are indicated in green and red respectively.

    5. Non-coding RNA

    Non-coding sequence is found in the genomes of all kingdoms of life, with fractions ranging from 8% in bacteria to more than 98% in the human genome (Figure 6). This non-coding fraction comprises functional non-coding RNAs such as transfer, ribosomal and regulatory RNAs, as well as DNA that remains untranscribed or gives rise to RNA molecules of unknown function. Genome size correlates positively with the amount of non-coding DNA and with the evolutionary age of the species, suggesting that smaller and earlier diverging species carry a smaller non-coding fraction in their genomes (Figure 6). This also suggests that non-coding RNAs arose with the complexity of species and the plethora of subsequent novel functions. Although non-coding DNA was initially argued to be spurious transcriptional noise or accumulated evolutionary debris arising from the early assembly of genes and/or the insertion of mobile genetic elements, there is now evidence suggesting that the previously named “junk DNA” may play a major biological role in cellular development, physiology and pathologies [68]. It is also argued that not all of it is functional: the transcription machinery is not perfect and will generate non-coding RNAs with no fitness advantage, and simply tolerating them would be more feasible than evolving and maintaining more rigorous control mechanisms to prevent their production [69]. Non-coding RNAs that appear to have an epigenetic function, including heterochromatin formation, DNA methylation, histone modification and transcriptional silencing, can be divided into two main categories based on their length: short non-coding RNAs (<30 nts) and long non-coding RNAs (>200 nts). Short interfering RNAs (siRNAs) of 21 nucleotides are produced from long double-stranded RNA through cleavage by the endonuclease Dicer and are bound by an Argonaute protein. They recognize and silence their target mRNAs by perfect sequence complementarity, in contrast to micro RNAs (miRNAs, 20 to 23 nts), which silence their target sequences by incomplete homology and act primarily at the translational level. Long non-coding RNAs (lncRNAs) have been reported in several eukaryotic genomes including mouse [70], human [71], Arabidopsis [72] and zebrafish [73].
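
    The distinction drawn above between perfect (siRNA-like) and incomplete (miRNA-like) complementarity, and the length classes of non-coding RNAs, can be sketched as follows. This is a simplified illustration under stated assumptions: the function names and the toy sequence are hypothetical, only Watson-Crick pairs are counted as matches (G:U wobble pairs are treated as mismatches), and the "intermediate" label for lengths between the two thresholds is an assumption, since the text does not name that range.

```python
# Watson-Crick complement table for RNA (A-U, C-G).
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence."""
    return rna.upper().translate(COMPLEMENT)[::-1]


def mismatches(small_rna: str, target_site: str) -> int:
    """Count positions where the target site deviates from a perfect match.

    Both sequences are written 5'->3'. A count of 0 corresponds to perfect
    complementarity; any higher count corresponds to incomplete homology.
    """
    if len(small_rna) != len(target_site):
        raise ValueError("small RNA and target site must have equal length")
    expected = reverse_complement(small_rna)
    return sum(a != b for a, b in zip(expected, target_site.upper()))


def length_class(length_nt: int) -> str:
    """Classify a non-coding RNA by length, using the thresholds in the text."""
    if length_nt < 30:
        return "short"
    if length_nt > 200:
        return "long"
    return "intermediate"  # label assumed here; the text does not name it


# Made-up 21-nt small RNA and its perfectly complementary target site.
small = "AUGGCUAACGUUAGCCGAUCA"
perfect_site = reverse_complement(small)
print(length_class(len(small)), mismatches(small, perfect_site))
```

    Real target prediction tools additionally weigh seed-region pairing and wobble pairs, which this sketch deliberately leaves out.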

    Figure 6. The percentage of coding fraction of several Eukaryotic and bacterial genomes (Adapted from [68]).

    Non-coding RNAs are highly diverse and new classes are constantly being discovered. For an exhaustive list of known non-coding RNAs, refer to [74]. Non-coding RNAs are known to occur in a wide range of species including human, insects, fish, plants, yeast, protists, and even bacteria and archaea, underlining a conserved phenomenon. In Chlamydomonas reinhardtii, two studies reported the existence of miRNAs that are reminiscent of the miRNAs of multicellular organisms, as well as of the phased trans-acting siRNAs (tasiRNAs) of plants. Chlamydomonas miRNAs do not seem to have sequence homology to any known miRNAs in animals or plants, suggesting that miRNA genes may have evolved independently in the lineages leading to animals, plants and green algae [75,76]. The discovery of small RNAs in diatoms and coccolithophores further confirmed the early divergence of such molecules [25,77,78].

    6. Conclusions and future perspectives

    Although epigenetics is recognized for its fundamental role in diseases such as cancer, there is still a long way to go before we appreciate its importance in shaping species genomes over evolutionary time scales. Through its dynamic regulation of genes, epigenetics allows individuals and populations to cope with biotic and abiotic stresses and to respond to environmental cues. It also provides progeny with better fitness when the parents have experienced a particular stress, thereby affecting their evolutionary potential. This is exemplified by DNA methylation, which induces mutations in DNA sequences via deamination and thereby affects genome nucleotide sequences. These mutations in chromosomal DNA might have an effect on the fitness and evolution of individuals and populations. Using model or non-model single-celled eukaryotes such as diatoms, which constitute an early diverging branch in the evolutionary tree, will provide a solid complement to multicellular organisms and enhance our understanding of the impact and true contribution of epigenetics to biological processes and ultimately to their evolutionary history. It is becoming clear that epigenetics and its impact on the evolutionary biology of species should be included in the way we think about and design experiments in biology.

    Acknowledgments

    AR is a PhD student funded by the MEMO LIFE International PhD program.

    Conflict of Interest

    All authors declare no conflicts of interest in this paper.

    Supplementary

    Figure S1. Gene Ontology (GO) enrichment analysis based on semantic clustering of molecular function (MF) associated with P. tricornutum-T. pseudonana orthologous genes which are (A) methylated only in P. tricornutum and (B) methylated in T. pseudonana. The X and Y axes represent the pairwise semantic similarity scores. The color of each sphere represents the uniqueness of the term when compared semantically to the whole list of molecular functions. More unique terms tend to be less dispensable. The graph was generated using Revigo [79].


    [1] C. Sinning, A. Kieback, P. S. Wild, R. B. Schnabel, F. Ojeda, S. Appelbaum, et al., Association of multiple biomarkers and classical risk factors with early carotid atherosclerosis: results from the Gutenberg Health Study, Clin. Res. Cardiol., 103 (2014), 477-485. https://doi.org/10.1007/s00392-014-0674-6
    [2] J. F. Polak, M. J. Pencina, D. H. O'Leary, R. B. D'Agostino, Common carotid artery intima-media thickness progression as a predictor of stroke in multi-ethnic study of atherosclerosis, Stroke, 42 (2011), 3017-3021. https://doi.org/10.1161/STROKEAHA.111.625186
    [3] M. W. Lorenz, C. Schaefer, H. Steinmetz, M. Sitzer, Is carotid intima media thickness useful for individual prediction of cardiovascular risk? Ten-year results from the Carotid Atherosclerosis Progression Study (CAPS), Eur. Heart J., 31 (2010), 2041-2048. https://doi.org/10.1093/eurheartj/ehq189
    [4] M. Soni, M. Ambrosino, D. S. Jacoby, The use of subclinical atherosclerosis imaging to guide preventive cardiology management, Curr. Cardiol. Rep., 23 (2021), 61. https://doi.org/10.1007/s11886-021-01490-7
    [5] A. Hazra, S. K. Mandal, A. Gupta, A. Mukherjee, A. Mukherjee, Heart disease diagnosis and prediction using machine learning and data mining techniques: a review, Adv. Comput. Sci. Technol., 10 (2017), 2137-2159.
    [6] M. Shouman, T. Turner, R. Stocker, Integrating Naive Bayes and K-means clustering with different initial centroid selection methods in the diagnosis of heart disease patients, Comput. Sci. Conf. Proc., 5 (2012), 125-137. https://doi.org/10.5121/csit.2012.2511
    [7] O. Terrada, B. Cherradi, A. Raihani, O. Bouattane, Classification and prediction of atherosclerosis diseases using machine learning algorithms, in International Conference on Optimization and Applications (ICOA), 5 (2019), 1-5. https://doi.org/10.1109/ICOA.2019.8727688
    [8] D. Han, K. K. Kolli, S. J. Al'Aref, L. Baskaran, A. R. van Rosendael, H. Gransar, et al., Machine learning framework to identify individuals at risk of rapid progression of coronary atherosclerosis: from the PARADIGM registry, J. Am. Heart Assoc., 9 (2020), e013958. https://doi.org/10.1161/JAHA.119.013958
    [9] O. Couturier, H. Delalin, H. Fu, G. Edouard, A three-step approach for stulong database analysis: characterization of patients groups, in Proceeding of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery, 2004.
    [10] M. Abdar, W. Ksiazek, U. R. Acharya, R. S. Tan, V. Makarenkov, P. Plawiak, A new machine learning technique for an accurate diagnosis of coronary artery disease, Comput. Methods Programs Biomed., 179 (2019), 104992. https://doi.org/10.1016/j.cmpb.2019.104992
    [11] V. S. H. Rao, M. N. Kumar, Novel approaches for predicting risk factors of atherosclerosis, IEEE J. Biomed. Health, 17 (2012), 183-189. https://doi.org/10.1109/TITB.2012.2227271
    [12] J. Xie, R. Wu, H. Wang, Y. Kong, H. Li, W. Zhang, A novel weight learning approach based on density for accurate prediction of atherosclerosis, in Intelligent Computing Theories and Application (eds. D. S. Huang, K. H. Jo, Z. K. Huang), Springer, (2019), 190-200. https://doi.org/10.1007/978-3-030-26969-2_18
    [13] W. He, Y. Xie, H. Lu, M. Wang, H. Chen, Predicting coronary atherosclerotic heart disease: an extreme learning machine with improved salp swarm algorithm, Symmetry, 12 (2020), 1651. https://doi.org/10.3390/sym12101651
    [14] A. Ward, A. Sarraju, S. Chung, J. Li, R. Harrington, P. Heidenreich, Machine learning and atherosclerotic cardiovascular disease risk prediction in a multi-ethnic population, NPJ Digit. Med., 125 (2020), 1-7. https://doi.org/10.1038/s41746-020-00331-1
    [15] S. Nikan, F. Gwadry-Sridhar, M. Bauer, Machine learning application to predict the risk of coronary artery atherosclerosis, in 2016 International Conference on Computational Science and Computational Intelligence (CSCI), (2016), 34-39. https://doi.org/10.1109/CSCI.2016.0014
    [16] J. Xie, H. Wang, J. Zhang, C. Meng, Y. Kong, S. Mao, et al., A novel hybrid subset-learning method for predicting risk factors of atherosclerosis, in 2017 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), (2017), 2124-2131. https://doi.org/10.1109/BIBM.2017.8217987
    [17] M. Priya, P. Ranjith Kumar, A novel intelligent approach for predicting atherosclerotic individuals from big data for healthcare, Int. J. Prod. Res., 53 (2015), 7517-7532. https://doi.org/10.1080/00207543.2015.1087655
    [18] A. I. Sakellarios, V. C. Pezoulas, C. Bourantas, K. K. Naka, L. K. Michalis, P. W. Serruys, et al., Prediction of atherosclerotic disease progression combining computational modelling with machine learning, in 2020 42nd Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC), (2020), 2760-2763. https://doi.org/10.1109/EMBC44109.2020.9176435
    [19] B. Kumar, H. Mathur, Comprehensive analysis of atherosclerosis disease prediction using machine learning, Ann. Rom. Soc. Cell Biol., 4 (2021), 17962-17975.
    [20] M. Lin, H. Cui, W. Chen, A. van Engelen, M. de Bruijne, M. R. Azarpazhooh, et al., Longitudinal assessment of carotid plaque texture in three-dimensional ultrasound images based on semi-supervised graph-based dimensionality reduction and feature selection, Comput. Biol. Med., 116 (2020), 103586. https://doi.org/10.1016/j.compbiomed.2019.103586
    [21] Q. A. Hathaway, N. Yanamala, M. J. Budoff, P. P. Sengupta, I. Zeb, Deep neural survival networks for cardiovascular risk prediction: The Multi-Ethnic Study of Atherosclerosis (MESA), Comput. Biol. Med., 139 (2021), 104983. https://doi.org/10.1016/j.compbiomed.2021.104983
    [22] A. D. Jamthikar, D. Gupta, L. Saba, N. N. Khanna, K. Viskovic, S. Mavrogeni, et al., Artificial intelligence framework for predictive cardiovascular and stroke risk assessment models: A narrative review of integrated approaches using carotid ultrasound, Comput. Biol. Med., 126 (2020), 104043. https://doi.org/10.1016/j.compbiomed.2020.104043
    [23] S. S. Skandha, S. K. Gupta, L. Saba, V. K. Koppula, A. M. Johri, N. N. Khanna, et al., 3-D optimized classification and characterization artificial intelligence paradigm for cardiovascular/stroke risk stratification using carotid ultrasound-based delineated plaque: Atheromatic™ 2.0, Comput. Biol. Med., 125 (2020), 103958. https://doi.org/10.1016/j.compbiomed.2020.103958
    [24] R. H. Lopes, I. D. Reid, P. R. Hobson, The two-dimensional Kolmogorov-Smirnov test, Prod. Sci., (2007), 1-12.
    [25] G. Biau, E. Scornet, A random forest guided tour, Test, 25 (2016), 197-227. https://doi.org/10.1007/s11749-016-0481-7
    [26] M. Noto, H. Sato, A method for the shortest path search by extended Dijkstra algorithm, in SMC 2000 Conference Proceedings. 2000 IEEE International Conference on Systems, Man and Cybernetics. 'Cybernetics Evolving to Systems, Humans, Organizations, and their Complex Interactions', 3 (2000), 2316-2320. https://doi.org/10.1109/ICSMC.2000.886462
    [27] N. Friedman, D. Geiger, M. Goldszmidt, Bayesian network classifiers, Mach. Learn., 29 (1997), 131-163. https://doi.org/10.1023/A:1007465528199
    [28] F. Xu, J. Zhang, X. Zhou, H. Hao, Lipoxin A4 and its analog attenuate high fat diet-induced atherosclerosis via Keap1/Nrf2 pathway, Exp. Cell Res., 412 (2022), 113025. https://doi.org/10.1016/j.yexcr.2022.113025
    [29] F. Polak, J. Y. C. Backlund, M. Budoff, P. Raskin, I. Bebu, J. M. Lachin, et al., Coronary artery disease events and carotid intima-media thickness in Type 1 diabetes in the DCCT/EDIC cohort, J. Am. Heart Assoc., 24 (2021), e022922. https://doi.org/10.1161/JAHA.121.022922
  • © 2022 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)