Entropy Methods for the Confidence Assessment of Probabilistic Classification Models


  • Gabriele Nunzio Tornetta




Machine learning, Naive Bayes, Uncertainty, Classification


Many classification models produce a probability distribution as the outcome of a prediction. This information is generally compressed down to the single class with the highest associated probability. In this paper we argue that part of the information discarded in this process can in fact be used to further evaluate the goodness of models, and in particular the confidence with which each prediction is made. As an application of the ideas presented in this paper, we provide a theoretical explanation of a confidence degradation phenomenon observed in the complement approach to the (Bernoulli) Naïve Bayes generative model.
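The general idea sketched in the abstract, namely recovering a confidence signal from the full predicted distribution rather than its arg-max alone, can be illustrated with the Shannon entropy of the class probabilities, normalized by its maximum value. This is a minimal sketch of that idea in plain NumPy, not the paper's specific method; the function name and the example probabilities are illustrative assumptions.

```python
import numpy as np

def normalized_entropy(proba, eps=1e-12):
    """Shannon entropy of each predicted distribution, scaled to [0, 1].

    A value near 0 means the model concentrates mass on one class
    (a confident prediction); a value near 1 means the distribution
    is close to uniform (a maximally uncertain prediction).
    """
    proba = np.asarray(proba, dtype=float)
    k = proba.shape[-1]                              # number of classes
    h = -np.sum(proba * np.log(proba + eps), axis=-1)
    return h / np.log(k)                             # divide by max entropy log(k)

# Two predictions over three classes: one peaked, one nearly uniform.
probs = np.array([[0.98, 0.01, 0.01],
                  [0.34, 0.33, 0.33]])
print(normalized_entropy(probs))  # low value first, value close to 1 second
```

Because the entropy is computed from the same `predict_proba`-style output that is usually discarded after taking the arg-max, a measure of this kind can be attached to any probabilistic classifier without retraining it.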






How to Cite

Tornetta, G. N. (2021). Entropy Methods for the Confidence Assessment of Probabilistic Classification Models. Statistica, 81(4), 383–398. https://doi.org/10.6092/issn.1973-2201/11479