Approximate Group Fairness for Clustering

Authors: Bo Li, Lijun Li, Ankang Sun, Chenhao Wang, Yingfan Wang

ICML 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Finally, in Section 6, we conduct experiments to examine the performance of our algorithms. We note that our algorithms have good theoretical guarantees in the worst case, but they may not find the fairest clustering for every instance. Accordingly, we first propose a two-stage algorithm to refine the clusters and then use synthetic and real-world data sets to show how much it outperforms classic ones regarding core fairness.
Researcher Affiliation | Academia | 1 Department of Computing, The Hong Kong Polytechnic University, Hong Kong, China; 2 School of Mathematical Sciences, Ocean University of China, Qingdao, China; 3 Warwick Business School, University of Warwick, United Kingdom; 4 University of Nebraska-Lincoln, United States; 5 Department of Computer Science, Duke University, United States.
Pseudocode | Yes | Algorithm 1 ALG_l(λ) for Line. ... Algorithm 2 ALG_t(λ) for Tree. ... Algorithm 3 ALG_g for General Metric Space. ... Algorithm 4 ALG_g^+(obj) for General Metric Space.
Open Source Code | No | The paper does not contain an explicit statement about releasing the source code for the described methodology, nor a link to a code repository.
Open Datasets | Yes | (2) Mopsi locations in clustering benchmark datasets (Fränti and Sieranoja, 2018) (real-world): a set of 2-D locations for n = 6014 users in Joensuu. ... Pasi Fränti and Sami Sieranoja. 2018. K-means properties on six clustering benchmark datasets. Appl. Intell. 48, 12 (2018), 4743–4759. http://cs.uef.fi/sipu/datasets/
Dataset Splits | No | The paper describes the datasets used and the range of k values for clustering, but it does not specify any training, validation, or test splits.
Hardware Specification | No | The paper describes the algorithms and experiments but does not provide any details about the hardware (e.g., CPU or GPU models) used to run the experiments.
Software Dependencies | No | The paper mentions implementing algorithms such as k-means++ but does not specify any software dependencies or library versions (e.g., Python, PyTorch, TensorFlow) that would be needed for replication.
Experiment Setup | Yes | For a range of k = 8, ..., 17 (horizontal axis), (c) and (d) (resp. (e) and (f)) compare the fairness and efficiency in the Gaussian dataset (resp. Mopsi locations). We want to build k = 10 centers to serve the nodes;
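Since no code is released, the sketch below illustrates one way to reproduce the classic baseline referenced in the Software Dependencies and Experiment Setup rows: running k-means++ on the public Mopsi locations benchmark for k = 8, ..., 17. It is a minimal sketch, not the authors' code; the file name "mopsi-joensuu.txt" and its whitespace format are assumptions, and the paper's two-stage core-fairness refinement and fairness metric are not reproduced here.

```python
import numpy as np
from sklearn.cluster import KMeans

# Assumed local copy of the benchmark file from http://cs.uef.fi/sipu/datasets/,
# one "x y" coordinate pair per line. File name and format are guesses.
points = np.loadtxt("mopsi-joensuu.txt")   # expected shape: (6014, 2) per the paper

for k in range(8, 18):  # k = 8, ..., 17, matching the reported experiment range
    km = KMeans(n_clusters=k, init="k-means++", n_init=10, random_state=0)
    labels = km.fit_predict(points)
    # km.inertia_ is the k-means cost (sum of squared distances to assigned centers),
    # used here only as a stand-in efficiency measure; the core-fairness comparison
    # from the paper is not computed in this sketch.
    print(f"k={k:2d}  cost={km.inertia_:.3e}  sizes={np.bincount(labels)}")
```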