The Autoregressive metric for comparing time series models

Authors

  • Domenico Piccolo, Università degli Studi di Napoli Federico II

DOI:

https://doi.org/10.6092/issn.1973-2201/3598

Abstract

The Autoregressive metric was first introduced in 1983 as a tool for choosing a representative element from a large collection of time series and for clustering temporal data. The proposal has since been extended to many contexts and has attracted increasing interest in both time series methods and applications. This paper presents the main results concerning the metric, its asymptotic distribution, and some operational and comparative issues. It concludes with a discussion of the merits of this distance criterion and some caveats about its use.

How to Cite

Piccolo, D. (2010). The Autoregressive metric for comparing time series models. Statistica, 70(4), 459–480. https://doi.org/10.6092/issn.1973-2201/3598
