Individual Fairness for k-Clustering
Authors: Sepideh Mahabadi, Ali Vakilian
ICML 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we provide an empirical evaluation of our local search based algorithm for α-fair k-clustering with k-median and k-means cost functions. We run experiments on three datasets (Diabetes, Bank, Census) that have been previously used in the context of fair clustering. |
| Researcher Affiliation | Academia | 1TTIC, Chicago, Illinois, USA 2University of Wisconsin, Madison, Wisconsin, USA. Correspondence to: Sepideh Mahabadi <mahabadi@ttic.edu>, Ali Vakilian <vakilian@mit.edu>. |
| Pseudocode | Yes | Algorithm 1 Finds a set of critical balls of size at most k; Algorithm 2 local search algorithm w.r.t. critical balls |
| Open Source Code | No | The paper does not provide any concrete access information such as a specific repository link, an explicit statement about code release, or mention of code in supplementary materials for the methodology described. |
| Open Datasets | Yes | We consider three datasets from UCI Machine Learning Repository (2017) which are standard benchmarks for clustering algorithms and in particular they were used in the context of fair k-median clustering in (Chierichetti et al., 2017; Chen et al., 2019; Backurs et al., 2019; Bera et al., 2019; Huang et al., 2019). |
| Dataset Splits | No | The paper mentions using subsets of data for experiments ('we randomly sample a subset of size 1000 points from the data set'), but it does not specify explicit training, validation, or test dataset splits or percentages. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., Python, PyTorch, or specific solvers). |
| Experiment Setup | Yes | In our experiments, we follow the description of Algorithm 1 and 2. The only discrepancy is that instead of considering a point to be covered if it has a center within distance of 6 times its fair radius, in our implementation, we consider a point covered if it has a center within distance of 3 times its fair radius (see line 6 in Algorithm 1). In all experiments, the input parameter α to our local search algorithms (i.e., the desired fairness guarantee) is the fairness approximation 1 η 2 returned by the FAIRKCENTER algorithm of (Jung et al., 2019). Finally, we consider values of k to be in range 5 to 30 with steps of size 5 and draw our plots as a function of k. |