Improving the estimation of multiple correlated dietary effects on colon-rectum cancer in multicentric studies:


  • Giulia Roli, Alma Mater Studiorum - Università di Bologna
  • Paola Monari, Alma Mater Studiorum - Università di Bologna



The paper deals with the analysis of the effects of multiple exposures on the occurrence of a disease in observational case-control studies. We consider the case of multilevel data, with subjects nested in spatial clusters. In this setting we often face small and sparse data, along with correlations among the exposures and among the observations, both of which invalidate the results of ordinary analyses. A hierarchical Bayesian model is proposed here to manage the within-cluster dependence and the correlation among the exposures. We assign prior distributions to the key parameters by exploiting additional information at different levels and by making suitable assumptions according to the problem at hand. The model is designed for application to a real multicentric study investigating the association between dietary exposures and colon-rectum cancer occurrence. Compared with results obtained from conventional regressions, the hierarchical Bayesian model is shown to yield more consistent and less biased estimates. Thanks to its flexibility, this approach is a powerful statistical tool that can be adopted in a wide range of applications. Moreover, the specification of more realistic priors may facilitate and extend the use of Bayesian solutions in the epidemiological field.
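The stabilising effect that a prior can have on sparse-data estimates can be illustrated with a minimal sketch. This is not the model of the paper: it is a generic normal-normal shrinkage calculation, in which each conventional log odds ratio estimate is combined with a normal prior by precision weighting. The function name `shrink`, the three estimates, their variances, and the prior (mean 0, variance 0.25) are all hypothetical values chosen for illustration.

```python
def shrink(estimate, variance, prior_mean, prior_variance):
    """Posterior mean and variance of a normal estimate under a normal prior.

    Generic precision-weighted average; an illustrative assumption,
    not the hierarchical model specified in the paper.
    """
    w_data = 1.0 / variance          # precision of the conventional estimate
    w_prior = 1.0 / prior_variance   # precision of the prior
    post_var = 1.0 / (w_data + w_prior)
    post_mean = post_var * (w_data * estimate + w_prior * prior_mean)
    return post_mean, post_var


# Hypothetical log odds ratios for three exposures and their variances.
# The first two are imprecise (sparse data); the third is well estimated.
log_or = [1.2, -0.8, 0.1]
variances = [0.9, 0.6, 0.05]

# Hypothetical prior: effects centred on 0 with variance 0.25.
shrunk = [
    shrink(b, v, prior_mean=0.0, prior_variance=0.25)
    for b, v in zip(log_or, variances)
]

for b, (m, v) in zip(log_or, shrunk):
    print(f"conventional {b:+.2f} -> shrunk {m:+.3f} (posterior variance {v:.3f})")
```

Under these assumed numbers, the imprecise estimates are pulled strongly toward the prior mean while the precise one barely moves, which is the general mechanism by which a prior can stabilise estimates from small and sparse data.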






How to Cite

Roli, G., & Monari, P. (2011). Improving the estimation of multiple correlated dietary effects on colon-rectum cancer in multicentric studies:. Statistica, 71(4), 437–452.