Leveraging Labeled and Unlabeled Data for Consistent Fair Binary Classification

Evgenii Chzhen; Christophe Denis; Mohamed Hebiri; Luca Oneto; Massimiliano Pontil

Conference paper, Year: 2019

Evgenii Chzhen

Role: Author
PersonId: 169831
IdHAL: echzhen
ORCID: 0009-0003-3065-4267

Laboratoire Analyse et de Mathématiques Appliquées

Laboratoire de Mathématiques d'Orsay

Statistique mathématique et apprentissage

Christophe Denis

Role: Author
PersonId: 1036220
IdHAL: christophedenisuge

Laboratoire Analyse et de Mathématiques Appliquées

Mohamed Hebiri

Role: Author
PersonId: 1048170

Laboratoire Analyse et de Mathématiques Appliquées

Luca Oneto

Role: Author

University of Pisa - Università di Pisa

Massimiliano Pontil

Role: Author

Istituto Italiano di Tecnologia

University College of London [London]

Abstract

We study the problem of fair binary classification using the notion of Equal Opportunity, which requires the true positive rate to be equal across the sensitive groups. Within this setting we show that the fair optimal classifier is obtained by recalibrating the Bayes classifier by a group-dependent threshold. We provide a constructive expression for the threshold. This result motivates us to devise a plug-in classification procedure based on both unlabeled and labeled datasets. While the latter is used to learn the output conditional probability, the former is used for calibration. The overall procedure can be computed in polynomial time and it is shown to be statistically consistent both in terms of the classification error and fairness measure. Finally, we present numerical experiments which indicate that our method is often superior to, or competitive with, the state-of-the-art methods on benchmark datasets.
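The calibration step described in the abstract can be illustrated with a minimal sketch. It assumes that estimated conditional probabilities (called `eta` here) have already been learned from the labeled sample, and it uses them on unlabeled points only. The true positive rate of a thresholded classifier is estimated without labels through the identity P(g = 1 | Y = 1) = E[eta · g] / E[eta]. The symmetric shift `theta` around 0.5, the grid search, and all function names are simplifying assumptions made for illustration; they are not the constructive threshold expression derived in the paper.

```python
import numpy as np


def estimated_tpr(eta, threshold):
    """Estimate the true positive rate of the rule eta >= threshold.

    Uses only scores (no labels): P(g=1 | Y=1) = E[eta * g] / E[eta].
    Assumes eta holds probabilities in [0, 1] that are not all zero.
    """
    predicted_positive = eta >= threshold
    return float(np.sum(eta * predicted_positive) / np.sum(eta))


def calibrate_shift(eta_group0, eta_group1, grid=None):
    """Choose a shift theta that minimizes the estimated TPR gap.

    Hypothetical parametrization: group 0 uses threshold 0.5 - theta,
    group 1 uses threshold 0.5 + theta.
    """
    if grid is None:
        grid = np.linspace(-0.4, 0.4, 81)
    gaps = [
        abs(estimated_tpr(eta_group0, 0.5 - t) - estimated_tpr(eta_group1, 0.5 + t))
        for t in grid
    ]
    return float(grid[int(np.argmin(gaps))])


def fair_predict(eta, group, theta):
    """Apply the group-dependent thresholds to scores and group labels."""
    thresholds = np.where(group == 1, 0.5 + theta, 0.5 - theta)
    return (eta >= thresholds).astype(int)


if __name__ == "__main__":
    # Synthetic unlabeled scores: group 0 tends to receive higher scores.
    rng = np.random.default_rng(0)
    eta0 = rng.beta(4, 2, size=2000)
    eta1 = rng.beta(2, 4, size=2000)
    theta = calibrate_shift(eta0, eta1)
    print("chosen shift:", theta)
    print("TPR group 0:", estimated_tpr(eta0, 0.5 - theta))
    print("TPR group 1:", estimated_tpr(eta1, 0.5 + theta))
```

In this sketch the labeled data would only enter through the model that produces `eta`, while the choice of `theta` relies on unlabeled scores alone, which mirrors the division of roles stated in the abstract.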

Keywords

Plug-in classifiers; Semi-supervised classification; Fairness; Equality of opportunities

Domains

Statistics [math.ST]; Statistics Theory [stat.TH]; Machine Learning [stat.ML]

Main file

main.pdf (1.19 MB)

Contributor: Mohamed Hebiri

https://hal.science/hal-02150662

Submitted on: Monday, February 3, 2020, 09:51:30

Last modified on: Wednesday, April 3, 2024, 14:16:03

Dates and versions

hal-02150662, version 1 (2019-06-07)

hal-02150662, version 2 (2020-02-03)

Identifiers

HAL Id: hal-02150662, version 2
arXiv: 1906.05082

Cite

Evgenii Chzhen, Christophe Denis, Mohamed Hebiri, Luca Oneto, Massimiliano Pontil. Leveraging Labeled and Unlabeled Data for Consistent Fair Binary Classification. NeurIPS 2019 - 33rd Annual Conference on Neural Information Processing Systems, Dec 2019, Vancouver, Canada. ⟨hal-02150662v2⟩

Collections

CNRS INRIA LAMA_UMR8050 LM-ORSAY LAMA_PS UPEC INRIA2 UNIV-PARIS-SACLAY GS-MATHEMATIQUES GS-COMPUTER-SCIENCE UNIV-EIFFEL
