Doubly Constrained Fair Clustering

Authors: John Dickerson, Seyed Esmaeili, Jamie H. Morgenstern, Claire Jie Zhang

NeurIPS 2023

Reproducibility Variable Result LLM Response
Research Type Experimental Finally, we carry out experiments to validate our theoretical findings. We conduct experiments over datasets from the UCI repository [23] to validate our theoretical findings.
Researcher Affiliation Academia John Dickerson (1, 2), Seyed A. Esmaeili (3), Jamie Morgenstern (4), and Claire Jie Zhang (4). (1) University of Maryland, College Park; (2) Arthur; (3) Simons Laufer Mathematical Sciences Institute; (4) University of Washington
Pseudocode Yes Algorithm 1 DIVIDE, Algorithm 2 DSTOGF+DS, Algorithm 3 ASSIGNMENTGF
Open Source Code No The paper does not provide a direct link or explicit statement about the public availability of the source code for the methodology described.
Open Datasets Yes We conduct experiments over datasets from the UCI repository [23] to validate our theoretical findings. Specifically, we use the Adult dataset sub-sampled to 20,000 records. Gender is used for group membership while the numeric entries are used to form a point (vector) for each record. We use the Euclidean distance.
Dataset Splits No The paper specifies using sub-sampled datasets (e.g., 'Adult dataset sub-sampled to 20,000 records' and 'subsample 6,000 records from the dataset' for Census1990) but does not provide details on training, validation, or test splits, nor does it mention cross-validation.
Hardware Specification Yes We use commodity hardware, specifically a MacBook Pro with an Apple M2 chip.
Software Dependencies Yes We use Python 3.9, the CPLEX package [38] for solving linear programs and NetworkX [27] for max-flow rounding. Further, Scikit-learn is used for some standard ML-related operations.
Experiment Setup Yes Further, for the GF constraints we set the lower and upper proportion bounds to $\beta_h = (1 - \delta) r_h$ and $\alpha_h = (1 + \delta) r_h$ for each color $h$, where $r_h$ is color $h$'s proportion in the dataset, and we set $\delta = 0.2$. For the DS constraints, since we do not deal with a large number of centers, we set $k^l_h = 0.8 r_h k$ and $k^u_h = r_h k$.
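
The parameter choices in the Experiment Setup row can be sketched in a few lines of Python. This is only an illustration of the stated formulas, not the authors' code: the function name `fairness_bounds`, its arguments, and the dictionary keys are hypothetical, and the source does not say whether the DS bounds are rounded to integers, so the values are left as real numbers here.

```python
from collections import Counter


def fairness_bounds(colors, k, delta=0.2, ds_lower_factor=0.8):
    """Compute per-color GF and DS bounds from a list of group labels.

    colors: one group label (color) per record.
    k: number of centers.
    delta: GF slack, 0.2 in the reported setup.
    ds_lower_factor: multiplier in the DS lower bound, 0.8 in the reported setup.
    All names here are illustrative assumptions, not taken from the paper.
    """
    n = len(colors)
    counts = Counter(colors)
    bounds = {}
    for h, count in counts.items():
        r_h = count / n  # proportion of color h in the dataset
        bounds[h] = {
            "beta": (1 - delta) * r_h,             # GF lower proportion bound
            "alpha": (1 + delta) * r_h,            # GF upper proportion bound
            "k_lower": ds_lower_factor * r_h * k,  # DS lower bound on centers
            "k_upper": r_h * k,                    # DS upper bound on centers
        }
    return bounds


if __name__ == "__main__":
    # Toy labels: 60% of records in group "a", 40% in group "b".
    example = fairness_bounds(["a"] * 6 + ["b"] * 4, k=10)
    for color, values in sorted(example.items()):
        print(color, values)
```

For a toy dataset with a 60/40 split and k = 10, the larger group gets GF bounds of roughly 0.48 and 0.72 and DS bounds of roughly 4.8 and 6 centers.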