The mathematical structure of the genetic code: a tool for inquiring on the origin of life


  • Diego Luis Gonzalez, CNR, Consiglio Nazionale delle Ricerche
  • Simone Giannerini, Alma Mater Studiorum - Università di Bologna
  • Rodolfo Rosa, Alma Mater Studiorum - Università di Bologna



In this paper we review our work on the mathematical structure of the genetic code and offer some new thoughts on it. The proposed model is a new theoretical tool that gives fresh insight into many open problems concerning the origin, evolution, and present structure of the genetic machinery. In particular, we show that the model implies the existence of dichotomic classes, quantities that might play a preeminent role in the management of genetic information, including error control mechanisms. We introduce and use techniques for the analysis of dependent sequences to study the correlation structure of series of dichotomic classes derived from protein-coding segments of DNA. The results show a complex, context-dependent correlation structure; this dependence gives important information about the coding and decoding strategies that nature has implemented over evolutionary time in DNA and RNA sequences.


How to Cite

Gonzalez, D. L., Giannerini, S., & Rosa, R. (2009). The mathematical structure of the genetic code: a tool for inquiring on the origin of life. Statistica, 69(2/3), 143–157.