Approximate Group Fairness for Clustering
Authors: Bo Li, Lijun Li, Ankang Sun, Chenhao Wang, Yingfan Wang
ICML 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments. Finally, in Section 6, we conduct experiments to examine the performance of our algorithms. We note that our algorithms have good theoretical guarantees in the worst case, but they may not find the fairest clustering for every instance. Accordingly, we first propose a two-stage algorithm to refine the clusters and then use synthetic and real-world data sets to show how much it outperforms classic ones regarding core fairness. |
| Researcher Affiliation | Academia | (1) Department of Computing, The Hong Kong Polytechnic University, Hong Kong, China; (2) School of Mathematical Sciences, Ocean University of China, Qingdao, China; (3) Warwick Business School, University of Warwick, United Kingdom; (4) University of Nebraska-Lincoln, United States; (5) Department of Computer Science, Duke University, United States. |
| Pseudocode | Yes | Algorithm 1: ALG_l(λ) for Line. ... Algorithm 2: ALG_t(λ) for Tree. ... Algorithm 3: ALG_g for General Metric Space. ... Algorithm 4: ALG_g^+(obj) for General Metric Space. |
| Open Source Code | No | The paper does not contain an explicit statement about releasing the source code for the described methodology or a link to a code repository. |
| Open Datasets | Yes | (2) Mopsi locations in clustering benchmark datasets (Fränti and Sieranoja, 2018) (real-world): a set of 2-D locations for n = 6014 users in Joensuu. ... Pasi Fränti and Sami Sieranoja. 2018. K-means properties on six clustering benchmark datasets. Appl. Intell. 48, 12 (2018), 4743-4759. http://cs.uef.fi/sipu/datasets/ |
| Dataset Splits | No | The paper describes the datasets used and the range of k values for clustering, but it does not specify any training, validation, or test dataset splits. |
| Hardware Specification | No | The paper describes the algorithms and experiments but does not provide any specific details about the hardware specifications (e.g., CPU, GPU models) used for running the experiments. |
| Software Dependencies | No | The paper mentions implementing algorithms like k-means++ but does not specify any software dependencies or libraries with version numbers (e.g., Python, PyTorch, TensorFlow versions) that would be needed for replication. |
| Experiment Setup | Yes | For a range of k = 8, ..., 17 (horizontal axis), (c) and (d) (resp. (e) and (f)) compare the fairness and efficiency in Gaussian dataset (resp. Mopsi locations). We want to build k = 10 centers to serve the nodes. |
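
Since the paper releases no code, the following is only a minimal sketch of what a sweep over k = 8, ..., 17 with a k-means++ baseline might look like. It is not the authors' implementation and does not compute their core-fairness measure. The synthetic data (300 standard-normal 2-D points), the seeds, and the helper names `kmeanspp_seed`, `kmeans_cost`, and `sweep` are all assumptions made for illustration; only the k-means++ seeding step is shown, without any refinement stage.

```python
import math
import random


def kmeanspp_seed(points, k, rng):
    """Pick k initial centers by k-means++ (D^2-weighted) sampling."""
    centers = [rng.choice(points)]
    # d2[i] = squared distance from points[i] to its nearest chosen center
    d2 = [math.dist(p, centers[0]) ** 2 for p in points]
    while len(centers) < k:
        # Sample the next center with probability proportional to d2.
        nxt = rng.choices(points, weights=d2, k=1)[0]
        centers.append(nxt)
        d2 = [min(old, math.dist(p, nxt) ** 2) for p, old in zip(points, d2)]
    return centers


def kmeans_cost(points, centers):
    """Sum of squared distances from each point to its nearest center."""
    return sum(min(math.dist(p, c) ** 2 for c in centers) for p in points)


def sweep(points, ks, seed=0):
    """Return {k: k-means cost of a k-means++ seeding} for each k in ks."""
    rng = random.Random(seed)
    return {k: kmeans_cost(points, kmeanspp_seed(points, k, rng)) for k in ks}


# Hypothetical stand-in for the paper's Gaussian dataset (parameters assumed).
data_rng = random.Random(0)
points = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(300)]

results = sweep(points, range(8, 18))  # k = 8, ..., 17
for k, c in results.items():
    print(k, round(c, 3))
```

To compare against a fairness-aware method, one would replace `kmeans_cost` with a fairness measure and add the competing algorithm's centers to the same loop; both are left out here because the summary above does not specify them.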