Title: Happiness Maximizing Sets under Group Fairness Constraints (Technical Report)

Abstract: Finding a happiness maximizing set (HMS), a small subset of tuples selected from a database that preserves the best score with respect to any nonnegative linear utility function, is an important problem in multi-criteria decision-making. When an HMS is extracted from a database of individuals to assist data-driven algorithmic decisions such as hiring and admission, it is crucial that the HMS fairly represents different groups of candidates, without bias or discrimination. However, although the HMS problem has been extensively studied in the database community, existing algorithms do not take group fairness into account and may return solutions that under-represent some groups.
In this paper, we propose and investigate a fair variant of HMS (FairHMS) that not only maximizes the minimum happiness ratio but also guarantees that the number of tuples chosen from each group falls within predefined lower and upper bounds. We show that FairHMS, like the vanilla HMS problem, is NP-hard in three and higher dimensions. We therefore first propose an exact interval cover-based algorithm called IntCov for FairHMS on two-dimensional databases. We then propose a bicriteria approximation algorithm called BiGreedy for FairHMS on multi-dimensional databases by transforming it into a submodular maximization problem under a matroid constraint. We also design an adaptive sampling strategy to improve the practical efficiency of BiGreedy. Extensive experiments on real-world and synthetic datasets confirm the efficacy and efficiency of our proposal.
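
To make the FairHMS objective and constraint concrete, here is a minimal Python sketch of how a candidate subset could be evaluated. It is not IntCov or BiGreedy. It assumes the usual definition of the happiness ratio from the HMS literature (best score in the subset divided by best score in the database, for one utility vector), which the abstract does not spell out. All function and variable names are hypothetical. Because it checks only a finite list of utility vectors, the value it returns is an upper bound on the true minimum happiness ratio over all nonnegative linear utility functions.

```python
def score(utility, tup):
    """Linear utility score: dot product of a nonnegative weight vector and a tuple."""
    return sum(w * x for w, x in zip(utility, tup))


def min_happiness_ratio(database, subset, utilities):
    """Smallest happiness ratio of a nonempty `subset` over the given utility vectors.

    Assumed definition: for one utility vector, the ratio is
    (best score in subset) / (best score in database).
    """
    worst = 1.0
    for u in utilities:
        best_db = max(score(u, t) for t in database)
        if best_db == 0:
            continue  # every tuple scores zero, so the ratio is undefined; skip it
        best_sub = max(score(u, t) for t in subset)
        worst = min(worst, best_sub / best_db)
    return worst


def satisfies_group_bounds(subset_groups, lower, upper):
    """True if each group's count in the subset lies within [lower, upper]."""
    counts = {}
    for g in subset_groups:
        counts[g] = counts.get(g, 0) + 1
    return all(lower[g] <= counts.get(g, 0) <= upper[g] for g in lower)


# Toy two-dimensional database (made-up values) with one group label per tuple.
database = [(1.0, 0.0), (0.0, 1.0), (0.6, 0.6)]
groups = ["a", "b", "a"]
utilities = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]  # hand-picked sample, not exhaustive

chosen = [0, 1]  # indices of the candidate subset
subset = [database[i] for i in chosen]
subset_groups = [groups[i] for i in chosen]

ratio = min_happiness_ratio(database, subset, utilities)
fair = satisfies_group_bounds(subset_groups, {"a": 1, "b": 1}, {"a": 1, "b": 1})
```

In this toy instance the subset takes exactly one tuple from each group, so it meets the bounds, while the utility vector (1, 1) prefers the omitted tuple (0.6, 0.6) and pulls the ratio below 1. A FairHMS solver would search over all subsets satisfying the bounds for the one with the largest such ratio.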
Comments: technical report under review
Subjects: Databases (cs.DB); Computers and Society (cs.CY); Data Structures and Algorithms (cs.DS)
Cite as: arXiv:2208.06553 [cs.DB]
  (or arXiv:2208.06553v2 [cs.DB] for this version)

Submission history

From: Yanhao Wang
[v1] Sat, 13 Aug 2022 02:54:29 GMT (2430kb)
[v2] Wed, 17 Aug 2022 07:35:38 GMT (4749kb,D)
