Individual Fairness for k-Clustering

Authors: Sepideh Mahabadi, Ali Vakilian

ICML 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental In this section, we provide an empirical evaluation of our local search based algorithm for α-fair k-clustering with k-median and k-means cost functions. We run experiments on three datasets (Diabetes, Bank, Census) that have been previously used in the context of fair clustering.
Researcher Affiliation Academia 1TTIC, Chicago, Illinois, USA 2University of Wisconsin, Madison, Wisconsin, USA. Correspondence to: Sepideh Mahabadi <mahabadi@ttic.edu>, Ali Vakilian <vakilian@mit.edu>.
Pseudocode Yes Algorithm 1 Finds a set of critical balls of size at most k; Algorithm 2 local search algorithm w.r.t. critical balls
Open Source Code No The paper does not provide any concrete access information such as a specific repository link, an explicit statement about code release, or mention of code in supplementary materials for the methodology described.
Open Datasets Yes We consider three datasets from UCI Machine Learning Repository (2017) which are standard benchmarks for clustering algorithms and in particular they were used in the context of fair k-median clustering in (Chierichetti et al., 2017; Chen et al., 2019; Backurs et al., 2019; Bera et al., 2019; Huang et al., 2019).
Dataset Splits No The paper mentions using subsets of data for experiments ('we randomly sample a subset of size 1000 points from the data set'), but it does not specify explicit training, validation, or test dataset splits or percentages.
Hardware Specification No The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used for running the experiments.
Software Dependencies No The paper does not provide specific software dependencies with version numbers (e.g., Python, PyTorch, or specific solvers).
Experiment Setup Yes In our experiments, we follow the description of Algorithm 1 and 2. The only discrepancy is that instead of considering a point to be covered if it has a center within distance of 6 times its fair radius, in our implementation, we consider a point covered if it has a center within distance of 3 times its fair radius (see line 6 in Algorithm 1). In all experiments, the input parameter α to our local search algorithms (i.e., the desired fairness guarantee) is the fairness approximation 1 η 2 returned by the FAIRKCENTER algorithm of (Jung et al., 2019). Finally, we consider values of k to be in range 5 to 30 with steps of size 5 and draw our plots as a function of k.