KFC: A Scalable Approximation Algorithm for $k$−center Fair Clustering

Authors: Elfarouk Harb, Ho Shan Lam

NeurIPS 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental We conduct experiments on 6 real-world datasets: reuters, victorian, 4area from [2], and bank, census, creditcard from [5]. Cost. Defined as the k center cost, or the maximum distance from a point to its nearest center reported by the algorithm. Runtime. We report the runtime in seconds that the algorithm take to terminate. If any algorithm takes longer than 30 minutes per run, we report a Time limit exceeded (TLE). Additive Violation of fairness constraint. We report the maximum additive violation E in any cluster output by all algorithms.
Researcher Affiliation Collaboration Elfarouk Harb Independent Researcher Hong Kong eyfmharb@gmail.com; Lam Ho Shan Hong Kong University of Science and Technology Hong Kong hslamaa@connect.ust.hk
Pseudocode Yes Algorithm 1 Fair k clustering
Open Source Code Yes The implementation for our algorithm and Ahmadian et al. [2] is available at [14]. [14] E. Harb and H. S. Lam. Scalable approximation algorithm for fair k-center clustering. https: //github.com/Farouk Y/KFC-Scalable Fair Clustering, 2020.
Open Datasets Yes We conduct experiments on 6 real-world datasets: reuters, victorian, 4area from [2], and bank, census, creditcard from [5].
Dataset Splits No The paper states 'We conduct experiments on 6 real-world datasets' but does not explicitly provide specific train/validation/test dataset splits, percentages, or explicit methodologies for data partitioning.
Hardware Specification Yes All the computations are run independently on a Macbook Pro with a 2.4 GHz Quad-Core Intel Core i5 processor.
Software Dependencies No All algorithms are written in Python 3. For our algorithm and [2], we use the CPLEX solver to handle the linear programs in our implementation which drastically improves the solution speed, with the linear programs defined using the library Pulp [22]. The text mentions Python 3 (a specific version for the language) but does not provide explicit version numbers for CPLEX or Pulp libraries, which are key software components.
Experiment Setup Yes The first experiment is a comparison with the baselines with respect to the fairness constraint α, where the parameters are fixed as k = 25, ϵ = 0.1, β = 0, and α s with feasible solutions are selected. We use α = 0.5, ϵ = 0.1 on reuters and victorian with varying k.