Genome characterization through dichotomic classes: An analysis of the whole chromosome 1 of A. thaliana
1. Dipartimento di Scienze Statistiche, Università di Bologna, Via delle Belle Arti 41, 40126 Bologna
2. CNR-IMM, UOS di Bologna, Via Gobetti 101, 40129 Bologna
3. Dipartimento di Scienze Statistiche, Università di Bologna, Via delle Belle Arti 41, 40126 Bologna
Received: 01 May 2012
Accepted: 29 June 2018
Published: 01 December 2012
MSC: 92B05, 92D20, 62P10.
In this article we show how dichotomic classes, binary variables naturally derived from a new mathematical model of the genetic code, can be used to characterize different parts of the genome. In particular, we analyze and compare different parts of the whole chromosome 1 of Arabidopsis thaliana: genes, exons, introns, coding sequences (CDS), intergenes, untranslated regions (UTR) and regulatory sequences. To this end, we encode each sequence in the three possible reading frames according to the definitions of the dichotomic classes (parity, Rumer and hidden), and then perform a statistical analysis on the resulting binary sequences. The results show that coding and non-coding sequences have different patterns and proportions of dichotomic classes. This suggests that the frame is important only for coding sequences and that dichotomic classes can be useful for recognizing them. Moreover, such patterns seem to be more pronounced in CDS than in exons. We also derive an independence test to assess whether the observed percentages could be regarded as the outcome of independent random processes. The results confirm that only genes, exons and CDS seem to possess a dependence structure that distinguishes them from i.i.d. sequences. Such informational content is independent of the global proportion of nucleotides in a sequence. The present work confirms that the recent mathematical model of the genetic code is a new paradigm for understanding the management and organization of genetic information, and an innovative tool for investigating informational aspects of the error detection/correction mechanisms acting at the level of DNA replication.
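The encoding step described above can be illustrated with a minimal sketch. It covers only the Rumer class, taken here in its classical form: a codon is assigned 1 when its first two bases form one of the eight roots whose four codons all specify the same amino acid in the standard genetic code, and 0 otherwise. The parity and hidden classes are not defined in this abstract, so they are left out. The 0/1 labelling, the skipping of codons with ambiguous bases, and the function names `rumer_bit`, `encode_frames` and `class_proportions` are assumptions made for illustration, not the authors' implementation.

```python
# Roots (first two bases) whose four codons share one amino acid in the
# standard genetic code: Thr, Pro, Ala, Ser, Arg, Gly, Leu, Val.
RUMER_FOUR = {"AC", "CC", "GC", "TC", "CG", "GG", "CT", "GT"}


def rumer_bit(codon):
    """Return 1 if the codon's root is a degeneracy-4 root, else 0 (assumed labelling)."""
    return 1 if codon[:2] in RUMER_FOUR else 0


def encode_frames(seq):
    """Encode a DNA string as three binary lists, one per reading frame."""
    seq = seq.upper()
    frames = []
    for offset in range(3):
        # Complete, non-overlapping codons starting at this offset.
        codons = [seq[i:i + 3] for i in range(offset, len(seq) - 2, 3)]
        # Assumption: codons containing ambiguous bases (e.g. N) are skipped.
        frames.append([rumer_bit(c) for c in codons if set(c) <= set("ACGT")])
    return frames


def class_proportions(frames):
    """Proportion of 1s in each frame (NaN for an empty frame)."""
    return [sum(bits) / len(bits) if bits else float("nan") for bits in frames]


if __name__ == "__main__":
    frames = encode_frames("ATGGCTCCA")
    print(frames)
    print(class_proportions(frames))
```

Comparing the three per-frame proportions for a given sequence is one simple way to see whether the reading frame matters, which is the kind of contrast the abstract reports between coding and non-coding regions.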
Citation: Enrico Properzi, Simone Giannerini, Diego Luis Gonzalez, Rodolfo Rosa. Genome characterization through dichotomic classes: An analysis of the whole chromosome 1 of A. thaliana. Mathematical Biosciences and Engineering, 2013, 10(1): 199-219. doi: 10.3934/mbe.2013.10.199