A survey on measuring indirect discrimination in machine learning

31 Oct 2015  ·  Indre Zliobaite ·

Nowadays, many decisions are made using predictive models built on historical data.Predictive models may systematically discriminate groups of people even if the computing process is fair and well-intentioned. Discrimination-aware data mining studies how to make predictive models free from discrimination, when historical data, on which they are built, may be biased, incomplete, or even contain past discriminatory decisions. Discrimination refers to disadvantageous treatment of a person based on belonging to a category rather than on individual merit. In this survey we review and organize various discrimination measures that have been used for measuring discrimination in data, as well as in evaluating performance of discrimination-aware predictive models. We also discuss related measures from other disciplines, which have not been used for measuring discrimination, but potentially could be suitable for this purpose. We computationally analyze properties of selected measures. We also review and discuss measuring procedures, and present recommendations for practitioners. The primary target audience is data mining, machine learning, pattern recognition, statistical modeling researchers developing new methods for non-discriminatory predictive modeling. In addition, practitioners and policy makers would use the survey for diagnosing potential discrimination by predictive models.

PDF Abstract


  Add Datasets introduced or used in this paper