Title: Happiness Maximizing Sets under Group Fairness Constraints (Technical Report)

Abstract: Finding a happiness maximizing set (HMS) from a database, i.e., selecting a small subset of tuples that preserves the best score with respect to any nonnegative linear utility function, is an important problem in multi-criteria decision-making. When an HMS is extracted from a set of individuals to assist data-driven algorithmic decisions such as hiring and admission, it is crucial to ensure that the HMS can fairly represent different groups of candidates without bias and discrimination. However, although the HMS problem has been extensively studied in the database community, existing algorithms do not take group fairness into account and may provide solutions that under-represent some groups.
In this paper, we propose and investigate a fair variant of HMS (FairHMS) that not only maximizes the minimum happiness ratio but also guarantees that the number of tuples chosen from each group falls within predefined lower and upper bounds. Similar to the vanilla HMS problem, we show that FairHMS is NP-hard in three and higher dimensions. Therefore, we first propose an exact interval cover-based algorithm called IntCov for FairHMS on two-dimensional databases. Then, we propose a bicriteria approximation algorithm called BiGreedy for FairHMS on multi-dimensional databases by transforming it into a submodular maximization problem under a matroid constraint. We also design an adaptive sampling strategy to improve the practical efficiency of BiGreedy. Extensive experiments on real-world and synthetic datasets confirm the efficacy and efficiency of our proposal.
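To make the two ingredients of FairHMS concrete, the following is a minimal Python sketch of the objective (the minimum happiness ratio) and the group fairness constraint (per-group lower and upper bounds). It is not the paper's IntCov or BiGreedy algorithm. The function names, the tuple representation, and the use of a finite list of utility vectors are assumptions made for illustration; the true minimum is taken over all nonnegative linear utility functions, so evaluating it on a finite sample only gives an upper bound on that minimum.

```python
from collections import Counter


def score(utility, point):
    """Linear utility score: dot product of a nonnegative weight vector and a tuple."""
    return sum(w * x for w, x in zip(utility, point))


def happiness_ratio(subset, database, utility):
    """Best score in the subset divided by the best score in the whole database."""
    best_in_db = max(score(utility, p) for p in database)
    if best_in_db == 0:
        return 1.0  # every tuple scores 0, so nothing is lost (illustrative convention)
    return max(score(utility, p) for p in subset) / best_in_db


def min_happiness_ratio(subset, database, utilities):
    """Minimum happiness ratio over a finite sample of utility vectors.

    Assumption: `utilities` is a hypothetical finite sample, so the result
    only upper-bounds the minimum over all nonnegative linear utilities.
    """
    return min(happiness_ratio(subset, database, u) for u in utilities)


def satisfies_group_bounds(subset_groups, lower, upper):
    """Check that the count chosen from each group lies within [lower, upper]."""
    counts = Counter(subset_groups)
    return all(lower[g] <= counts.get(g, 0) <= upper[g] for g in lower)


# Toy two-dimensional example with hypothetical data.
database = [(1.0, 0.0), (0.0, 1.0), (0.6, 0.6)]
subset = [(0.6, 0.6)]
utilities = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
ratio = min_happiness_ratio(subset, database, utilities)
```

A candidate solution would be feasible in this sketch only if `satisfies_group_bounds` returns `True` for the group labels of its tuples; among feasible subsets, the one with the largest minimum happiness ratio is preferred.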
Comments: Technical report, a shorter version to appear in PVLDB 16(2)
Subjects: Databases (cs.DB); Computers and Society (cs.CY); Data Structures and Algorithms (cs.DS)
DOI: 10.14778/3565816.3565830
Cite as: arXiv:2208.06553 [cs.DB]
  (or arXiv:2208.06553v3 [cs.DB] for this version)

Submission history

From: Yanhao Wang
[v1] Sat, 13 Aug 2022 02:54:29 GMT (2430kb)
[v2] Wed, 17 Aug 2022 07:35:38 GMT (4749kb,D)
[v3] Sat, 8 Oct 2022 08:46:52 GMT (4709kb,D)