Evolutionary Computation Methods and their applications in Statistics


  • Francesco Battaglia Università di Roma “La Sapienza”




A brief discussion of the genesis of evolutionary computation methods, their relationship to artificial intelligence, and the contribution of genetics and Darwin’s theory of natural evolution is provided. Then, the main evolutionary computation methods are illustrated: evolution strategies, genetic algorithms, estimation of distribution algorithms, differential evolution, and a brief description of some evolutionary behavior methods such as ant colony and particle swarm optimization. We also discuss the role of the genetic algorithm for multivariate probability distribution random generation, rather than as a function optimizer. Finally, some relevant applications of genetic algorithm to statistical problems are reviewed: selection of variables in regression, time series model building, outlier identification, cluster analysis, design of experiments.


T. BACK (1996). Evolutionary algorithms in theory and practice. Oxford University Press,Oxford.

K. BALCOMBE (2005). Model selection using information criteria and genetic algorithms. Computational Economics, 25, pp. 207–228.

S. BANDYOPADHYAY, U. MAULIK (2002). Genetic clustering for automatic evolution of clusters and application to image classification. Pattern Recognition, 35, pp. 1197–1208.

R. BARAGONA (2001). A simulation study on clustering time series with metaheuristic methods. Quaderni di Statistica, 3, pp. 1–26.

R. BARAGONA, F. BATTAGLIA (2009). Evolutionary computing in statistical data analysis. In A. ABRAHAM, A.-E.HASSANIEN, P. SIARRY, A. ENGELBRECHT (eds.), Foundations of Computational Intelligence vol 3 - Global Optimization (Studies in Computational Intelligence vol. 203), Springer, Berlin/Heidelberg, pp. 347–386.

R. BARAGONA, F. BATTAGLIA, C. CALZINI (2001). Genetic algorithms for the identification of additive and innovation outliers in time series. Computational Statistics & Data Analysis, 37, pp. 1–12.

R. BARAGONA, F. BATTAGLIA, D. CUCINA (2004). Fitting piecewise linear threshold autoregressive models by means of genetic algorithms. Computational Statistics & Data Analysis, 47, pp. 277–295.

R. BARAGONA, D. CUCINA (2008). Double threshold autoregressive conditionally heteroscedastic model building by genetic algorithms. Journal of Statistical Computation and Simulation, 78, pp. 541–558.

F. BATTAGLIA (2001). Genetic algorithms, pseudo-random numbers generators, and Markov chain Monte Carlo methods. Metron, 59, pp. 131–155.

H. G. BEYER, H. SCHWEFEL (2002). Evolution strategies, a comprehensive introduction. Natural Computing, 1, pp. 3–52.

H. BOZDOGAN, P. BEARSE (2003). Icomp: A new model-selection criterion. In H. H. BOCK (ed.), Classification and Related Methods of Data Analysis, Elsevier Science Publishers (North Holland), Amsterdam, pp. 599–608.

S. CHATTERJEE, M. LAUDATO, L. A. LYNCH (1996). Genetic algorithms and their statistical applications: an introduction. Computational Statistics & Data Analysis, 22, pp. 633–651.

M. CHIOGNA, C.GAETAN, G.MASAROTTO (2008). Automatic identification of seasonal transfer function models by means of iterative stepwise and genetic algorithms. Journal of Time Seies Analysis, 29, pp. 37–50.

K. D. CRAWFORD, R. L. WAINWRIGHT (1995). Applying genetic algorithms to outlier detection. In L. J. ESHELMAN (ed.), Proceedings of the Sixth International Conference on Genetic Algorithms, Morgan Kaufmann, San Mateo, pp. 546–550.

R. DAVIS, T. LEE, G. RODRIGUEZ-YAM (2006). Structural break estimation for nonstationary time series models. Journal of the American Statistical Association, 101, pp. 223–239.

K. A. DE JONG (1993). Genetic algorithms are not function optimizers. In D. WHITLEY (ed.), Foundations of Genetic Algorithms 2, Morgan Kaufman, San Mateo, pp. 1–18.

M. DORIGO (1992). Optimization, Learning and Natural Algorithms. Ph.D. thesis, Politecnico di Milano, Italy.

M.DORIGO,M. GAMBARDELLA (1997). Ant colony system: a cooperative learning approach to the traveling salesman problem. IEEE Transactions on Evolutionary Computation, 1, pp. 53–66.

M. DORIGO, T. STÜTZLE (2004). Ant Colony Optimization. MIT Press, Cambridge.

M. DRUGAN, D. THIERENS (2004). Evolutionary Markov chain Monte Carlo. In P. LIARDET (ed.), Proc. Sixth Intern. Conf. on Artificial Evolution - EA 2003. Springer, Berlin, pp. 63–76.

D. B. FOGEL (1998). Evolutionary computation: toward a new philosophy of machine intelligence. IEEE Press, New York.

L. J. FOGEL, A. J. OWENS, M. J. WALSH (1966). Artificial intelligence through simulated evolution. Wiley, New York.

M. FORLIN, I. POLI, D. DE MARCH, N. PACKARD, G. GAZZOLA, R. SERRA (2008). Evolutionary experiments for self-assembling amphiphilic systems. Chemometrics and Intelligent Laboratory Systems, 90, pp. 153–160.

C. GAETAN (2000). Subset arma model identification using genetic algorithms. Journal of Time Series Analysis, 21, pp. 559–570.

C. J. GEYER (1991). Markov chain Monte Carlo maximum likelihood. In E.M. KERAMIDAS (ed.), Computing Science and Statistics, Proc. 23rd Symposium on the Interface. Interface Foundation, Fairfax Station, pp. 156–163.

W. R. GILKS, R. RICHARDSON, D. J. SPIEGELHALTER (1996). Markov Chain Monte Carlo in Practice. Chapman and Hall/CRC.

G. GOSWAMI, J. S. LIU (2007). On learning strategies for evolutionary Monte Carlo. Statistics and Computing, 17, pp. 23–38.

J. HANDL, J. KNOWLES, M. DORIGO (2003). Ant-based clustering: a comparative study of its relative performance with respect to k-means, average link and 1 d-som. Technical Report TR/IRIDIA/2003-24, IRIDIA, Université Libre de Bruxelles, Belgium. Http://wwwcip.informatik.uni-erlangen.de/ˆsijuhand/TR-IRIDIA-2003-24.pdf.

C. C. HOLMES, B. K. MALLICK (1998). Parallel Markov chain Monte Carlo sampling. mimeo, Dept. of Mathematics, Imperial College, London.

G. KAPETANIOS (2007). Variable selection in regression models using nonstandard optimisation of information criteria. Computational Statistics & Data Analysis, 52, pp. 4–15.

J. KENNEDY, R. EBERHART (1995). Particle swarm optimization. In Proc. IEEE Conference on Neural Networks. Piscataway, pp. 1942–48.

J. KENNEDY, R. EBERHART (2001). Swarm Intelligence. Morgan Kaufmann, San Mateo.

H.-J. KIM, K.-S. SHIN (2007). A hybrid approach based on neural networks and genetic algorithms for detecting temporal patterns in stock markets. Applied Soft Computing, 7, pp. 569–576.

J. R. KOZA (1992). Genetic programming: on the programming of computers by means of natural selection. MIT Press, Cambridge.

P. LARRAÑAGA, J. A. LOZANO (2002). Estimation of distribution algorithms: a new tool for evolutionary optimization. Kluwer, Boston.

F. LIANG,W.H.WONG (2000). Evolutionary Monte Carlo: applications to cp model sampling and change point problem. Statistica Sinica, 10, pp. 317–342.

F. LIANG, W. H. WONG (2001). Real-parameter evolutionary Monte Carlo with applications to Bayesian mixture models. Journal of the American Statistical Association, 96, pp. 653–666.

J. A. LOZANO, P. LARRAÑAGA, I. INZA, G. BENGOETXEA (2006). Towards a new evolutionary computation. Advances in estimation of distribution algorithms. Springer, Berlin.

E. MARINARI, G. PARISI (1992). Simulated tempering: a new Monte Carlo scheme. Europhysics Letters, 19, pp. 451–458.

U. MAULIK, S. BANDYOPADHYAY (2003). Fuzzy partitioning using real coded variable length genetic algorithmfor pixel classification. IEEE Transactions onGeoscience and Remote Sensing, 41, pp. 1075–1081.

T. MINERVA, S. PATERLINI (2002). Evolutionary approaches for statistical modelling. In D. B.

FOGEL,M. A. EL-SHARKAM, G. YAO,H. GREENWOOD, P. IBA, P. MARROW,M. SHAKLETON (eds.), Evolutionary Computation 2002. Proceedings of the 2002 Congress on Evolutionary Computation. IEEE Press, Piscataway, vol. 2, pp. 2023–2028.

C. A. MURTHY, N. CHOWDHURY (1996). In search of optimal clusters using genetic algorithms. Pattern Recognition Letters, 17, pp. 825–832.

C. S.ONG, J. J.HUANG,G.H. TZENG (2005). Model identification of ARIMA family using genetic algorithms. Appl. Math. Comput., 164, pp. 885–912.

K. V. PRICE, R. STORN, J. LAMPINEN (2005). Differential evolution, a practical approach to global optimization. Springer, Berlin.

C. R. REEVES, J. E. ROWE (2003). Genetic algorithms - Principles and Perspective: A Guide to GA Theory. Kluwer, London.

G. O. ROBERTS, W. R. GILKS (1994). Convergence of adaptive direction sampling. Journal of Multivariate Analysis, 49, pp. 287–294.

A. ROVERATO, I. POLI (1998). A genetic algorithm for graphical model selection. Journal of the Italian Statistical Society, 7, pp. 197–208.

R. SABATIER, C. REYNÉS (2008). Extensions of simple component analysis and simple linear discriminant analysis using genetic algorithms. Computational Statistics & Data Analysis, 52, pp. 4779–4789.

T. STÜTZLE,H. H. HOOS (2000). Max–min ant systems. Future Generation Computer Systems, 16, pp. 889–914.

L. Y. TSENG, S. B. YANG (2001). A genetic approach to the automatic clustering problem. Pattern Recognition, 34, pp. 415–424.

D. H. WOLPERT, W. G. MACREADY (1997). No free lunch theorems for optimization. IEEE Transactions on Evolutionary Computation., 1, pp. 67–82.

B. WU, C.-L. CHANG (2002). Using genetic algorithms to parameters (d,r) estimation for threshold autoregressive models. Computational Statistics & Data Analysis, 38, pp. 315–330.




How to Cite

Battaglia, F. (2009). Evolutionary Computation Methods and their applications in Statistics. Statistica, 69(2/3), 201–224. https://doi.org/10.6092/issn.1973-2201/3555