On concurvity in nonlinear and nonparametric regression models

Authors

  • Sonia Amodio Università degli Studi di Napoli Federico II
  • Massimo Aria Università degli Studi di Napoli Federico II
  • Antonio D’Ambrosio Università degli Studi di Napoli Federico II

DOI:

https://doi.org/10.6092/issn.1973-2201/4599

Keywords:

Concurvity, multicollinearity, nonparametric regression, additive models

Abstract

When data are affected by multicollinearity in the linear regression framework, concurvity will be present when fitting a generalized additive model (GAM). The term concurvity describes nonlinear dependencies among the predictor variables. Just as collinearity inflates the variance of the estimated regression coefficients in the linear regression model, concurvity leads to instability of the estimated coefficients in GAMs. Even though the backfitting algorithm will always converge to a solution, in the presence of concurvity the final solution obtained when fitting a GAM depends on the starting functions. While exact concurvity is highly unlikely, approximate concurvity, the analogue of multicollinearity, is of practical concern because it can lead to upwardly biased parameter estimates and to underestimation of their standard errors, increasing the risk of committing a type I error. We compare the existing approaches to detecting concurvity, pointing out their advantages and drawbacks, using simulated and real data sets. As a result, this paper provides a general criterion for detecting concurvity in nonlinear and nonparametric regression models.
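
To make the distinction between collinearity and concurvity concrete, the following minimal sketch builds two predictors that are almost uncorrelated linearly yet almost perfectly related through a curve. It is an illustration only and is not the detection criterion proposed in the paper: the function names `linear_r2` and `nonlinear_r2`, the simulated data, and the cubic polynomial used as a stand-in for a smoother are all assumptions made for this example.

```python
import numpy as np


def linear_r2(x, z):
    """Squared Pearson correlation between x and z (the collinearity view)."""
    return float(np.corrcoef(x, z)[0, 1] ** 2)


def nonlinear_r2(x, z, degree=3):
    """R^2 from regressing z on a polynomial in x.

    Assumption: the polynomial is a simple stand-in for a smoother;
    it is not the criterion developed in the paper.
    """
    coeffs = np.polyfit(x, z, degree)
    fitted = np.polyval(coeffs, x)
    ss_res = np.sum((z - fitted) ** 2)
    ss_tot = np.sum((z - z.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


# Hypothetical simulated predictors: x2 is a noisy quadratic function of x1.
rng = np.random.default_rng(0)
x1 = np.linspace(-1.0, 1.0, 201)
x2 = x1 ** 2 + rng.normal(scale=0.01, size=x1.size)

r2_linear = linear_r2(x1, x2)        # close to 0: no linear dependence
r2_nonlinear = nonlinear_r2(x1, x2)  # close to 1: strong nonlinear dependence

print(f"linear R^2:    {r2_linear:.4f}")
print(f"nonlinear R^2: {r2_nonlinear:.4f}")
```

In this setting a diagnostic based on linear correlation would not flag the pair, while a diagnostic based on a flexible fit of one predictor on the other would, which is the sense in which concurvity generalizes multicollinearity.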

Published

2014-03-31

How to Cite

Amodio, S., Aria, M., & D’Ambrosio, A. (2014). On concurvity in nonlinear and nonparametric regression models. Statistica, 74(1), 85–98. https://doi.org/10.6092/issn.1973-2201/4599

Section

Articles