KFC: A Scalable Approximation Algorithm for $k$−center Fair Clustering
Authors: Elfarouk Harb, Ho Shan Lam
NeurIPS 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct experiments on 6 real-world datasets: reuters, victorian, 4area from [2], and bank, census, creditcard from [5]. Cost. Defined as the k center cost, or the maximum distance from a point to its nearest center reported by the algorithm. Runtime. We report the runtime in seconds that the algorithm take to terminate. If any algorithm takes longer than 30 minutes per run, we report a Time limit exceeded (TLE). Additive Violation of fairness constraint. We report the maximum additive violation E in any cluster output by all algorithms. |
| Researcher Affiliation | Collaboration | Elfarouk Harb Independent Researcher Hong Kong eyfmharb@gmail.com; Lam Ho Shan Hong Kong University of Science and Technology Hong Kong hslamaa@connect.ust.hk |
| Pseudocode | Yes | Algorithm 1 Fair k clustering |
| Open Source Code | Yes | The implementation for our algorithm and Ahmadian et al. [2] is available at [14]. [14] E. Harb and H. S. Lam. Scalable approximation algorithm for fair k-center clustering. https: //github.com/Farouk Y/KFC-Scalable Fair Clustering, 2020. |
| Open Datasets | Yes | We conduct experiments on 6 real-world datasets: reuters, victorian, 4area from [2], and bank, census, creditcard from [5]. |
| Dataset Splits | No | The paper states 'We conduct experiments on 6 real-world datasets' but does not explicitly provide specific train/validation/test dataset splits, percentages, or explicit methodologies for data partitioning. |
| Hardware Specification | Yes | All the computations are run independently on a Macbook Pro with a 2.4 GHz Quad-Core Intel Core i5 processor. |
| Software Dependencies | No | All algorithms are written in Python 3. For our algorithm and [2], we use the CPLEX solver to handle the linear programs in our implementation which drastically improves the solution speed, with the linear programs defined using the library Pulp [22]. The text mentions Python 3 (a specific version for the language) but does not provide explicit version numbers for CPLEX or Pulp libraries, which are key software components. |
| Experiment Setup | Yes | The first experiment is a comparison with the baselines with respect to the fairness constraint α, where the parameters are fixed as k = 25, ϵ = 0.1, β = 0, and α s with feasible solutions are selected. We use α = 0.5, ϵ = 0.1 on reuters and victorian with varying k. |