Small area estimation with covariates perturbed for disclosure limitation


  • Silvia Polettini Università di Roma “La Sapienza”
  • Serena Arima Università di Roma “La Sapienza”



Disclosure limitation, Hierarchical Bayesian models, measurement error, PRAM, small area


We exploit the connections between measurement error and data perturbation for disclosure limitation in the context of small area estimation. Our starting point is the model of Ybarra and Lohr (2008), in which some of the covariates (all continuous) are measured with error. Using a fully Bayesian approach, we extend this model to include both continuous and categorical auxiliary variables, each possibly perturbed by disclosure limitation methods, with masking distributions fixed according to the assumed protection mechanism. To investigate the feasibility of the proposed method, we conduct a simulation study exploring the effect of different post-randomization scenarios on the small area model.
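As an illustration of the kind of perturbation considered, the following is a minimal sketch of post-randomization (PRAM, Gouweleeuw et al., 1998) applied to a single categorical covariate: each true category is replaced by a draw from the corresponding row of a known transition matrix. The function name `pram`, the three categories, and the transition probabilities are assumptions chosen for illustration only; they are not the scenarios used in the simulation study.

```python
import random


def pram(values, categories, transition, rng):
    """Mask categorical values with PRAM.

    transition[k][l] is the probability that true category k
    is released as category l; each row must sum to 1.
    """
    index = {c: k for k, c in enumerate(categories)}
    masked = []
    for v in values:
        row = transition[index[v]]
        # Draw the released category from the row of the true category.
        masked.append(rng.choices(categories, weights=row, k=1)[0])
    return masked


# Hypothetical setup: three categories, each retained with probability 0.8.
categories = ["a", "b", "c"]
P = [
    [0.8, 0.1, 0.1],
    [0.1, 0.8, 0.1],
    [0.1, 0.1, 0.8],
]

rng = random.Random(0)  # fixed seed so the sketch is reproducible
true_x = [rng.choice(categories) for _ in range(10000)]
masked_x = pram(true_x, categories, P, rng)

# Share of records whose category was changed; close to 0.2 by construction.
changed = sum(t != m for t, m in zip(true_x, masked_x)) / len(true_x)
print(f"share of changed records: {changed:.3f}")
```

Because the transition matrix is fixed by the protection mechanism and known to the analyst, it can play the role of a known misclassification distribution for the categorical covariate, which is the sense in which the masking distribution is "fixed" above.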


S. ARIMA, G. DATTA, B. LISEO (2012a). Bayesian estimators for small area models when auxiliary information is measured with error. Scandinavian Journal of Statistics, in press.

S. ARIMA, G. DATTA, B. LISEO (2012b). Objective Bayesian analysis of a measurement error small area model. Bayesian Analysis, 7(2), pp. 363–384.

R. BRAND (2002). Microdata protection through noise addition. In J. Domingo-Ferrer (ed.), Inference Control in Statistical Databases, Springer, vol. 2316 of Lecture Notes in Computer Science, pp. 97–116.

R. J. CARROLL, D. RUPPERT, L. STEFANSKI, C. CRAINICEANU (2006). Measurement Error in Nonlinear Models: A Modern Perspective. Chapman & Hall/CRC, 2nd ed.

T. DALENIUS, S. P. REISS (1982). Data-swapping: A technique for disclosure control. Journal of Statistical Planning and Inference, 6, no. 1, pp. 73–85.

G. DATTA (2009). Model-based approach to small area estimation. Handbook of Statistics: Sample Surveys: Inference and Analysis, Volume 29B, Eds.: D. Pfeffermann and C.R. Rao. The Netherlands: North-Holland, pp. 251–288.

R. FAY, R. HERRIOT (1979). Estimates of income for small places: an application of James-Stein procedures to census data. Journal of the American Statistical Association, 74, pp. 269–277.

W. A. FULLER (1993). Masking procedures for microdata disclosure limitation. Journal of Official Statistics, 9, pp. 383–406.

M. GHOSH, K. SINHA, D. KIM (2006). Empirical and hierarchical Bayesian estimation in finite population sampling under structural measurement error model. Scandinavian Journal of Statistics, 33, pp. 591–608.

J. GOUWELEEUW, P. KOOIMAN, L. WILLENBORG, P.-P. DE WOLF (1998). Post randomisation for statistical disclosure control: Theory and implementation. Journal of Official Statistics, 14, pp. 463–478.

J. KIM (1986). A method for limiting disclosure of microdata based on random noise and transformation. In Proceedings of the Survey Research Methods Section, American Statistical Association, pp. 370–374.

R. J. A. LITTLE (1993). Statistical analysis of masked data. Journal of Official Statistics, 9, pp. 407–426.

D. PFEFFERMANN (2013). New important developments in small area estimation. Statistical Science, 28, pp. 40–68.

J. N. K. RAO (2003). Small Area Estimation. Wiley series in survey methodology. John Wiley and Sons, New York.

N. SHLOMO, C. SKINNER (2010). Assessing the protection provided by misclassification-based disclosure limitation methods for survey microdata. Annals of Applied Statistics, 4, no. 3, pp. 1291–1310.

L. WILLENBORG, T. DE WAAL (2001). Elements of Statistical Disclosure Control. Springer, New York.

Y. WOO, A. SLAVKOVIĆ (2012). Logistic regression with variables subject to post randomization method. In J. Domingo-Ferrer, I. Tinnirello (eds.), Privacy in Statistical Databases, Springer Berlin Heidelberg, vol. 7556 of Lecture Notes in Computer Science, pp. 116–130.

L. YBARRA, S. LOHR (2008). Small area estimation when auxiliary information is measured with error. Biometrika, 95, pp. 919–931.




How to Cite

Polettini, S., & Arima, S. (2015). Small area estimation with covariates perturbed for disclosure limitation. Statistica, 75(1), 57–72.