Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Sample Selection for Fair and Robust Training

Authors: Yuji Roh, Kangwook Lee, Steven Whang, Changho Suh

NeurIPS 2021

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "Experiments show that our algorithm obtains fairness and robustness that are better than or comparable to the state-of-the-art technique, both on synthetic and benchmark real datasets." |
| Researcher Affiliation | Academia | Yuji Roh (KAIST), Kangwook Lee (University of Wisconsin-Madison), Steven Euijong Whang (KAIST), Changho Suh (KAIST) |
| Pseudocode | Yes | "Algorithm 1: Greedy-Based Clean and Fair Sample Selection" |
| Open Source Code | No | The paper does not include an explicit statement or a direct link to the source code for the methodology described. |
| Open Datasets | Yes | "We utilize two real datasets, ProPublica COMPAS [Angwin et al., 2016] and Adult Census [Kohavi, 1996], and use the same pre-processing in IBM Fairness 360 [Bellamy et al., 2019]." |
| Dataset Splits | No | The paper mentions evaluating on "separate clean test sets" and minibatches, but does not specify the train/validation/test splits (e.g., percentages or sample counts) for its experiments. |
| Hardware Specification | Yes | "All experiments are run on Intel Xeon Silver 4210R CPUs and NVIDIA Quadro RTX 8000 GPUs." |
| Software Dependencies | Yes | "Our algorithm and FairBatch are implemented in PyTorch [Paszke et al., 2019] with Python 3.8." |
| Experiment Setup | Yes | "We train our models using stochastic gradient descent (SGD) with a batch size of 200, momentum of 0.9, and learning rate of 0.001 for 3,000 epochs." |
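The optimizer settings reported under Experiment Setup (SGD, momentum 0.9, learning rate 0.001) can be sketched as a plain-Python momentum update. This is a generic illustration under stated assumptions, not the authors' training code: the 1-D quadratic loss, the `grad` and `sgd_momentum` names, and running 3,000 plain update steps (rather than 3,000 epochs over minibatches of 200) are all placeholders for the paper's actual model and data.

```python
# Minimal sketch of SGD with momentum using the paper's reported
# hyperparameters (lr = 0.001, momentum = 0.9). The quadratic loss
# f(w) = (w - 3)^2 is a hypothetical stand-in for the real objective.

LR = 0.001       # learning rate reported in the paper
MOMENTUM = 0.9   # momentum reported in the paper

def grad(w):
    """Gradient of the placeholder loss f(w) = (w - 3)^2."""
    return 2.0 * (w - 3.0)

def sgd_momentum(w, steps):
    """Run `steps` momentum-SGD updates: v <- m*v + g; w <- w - lr*v."""
    v = 0.0
    for _ in range(steps):
        g = grad(w)
        v = MOMENTUM * v + g   # accumulate velocity (PyTorch-style momentum)
        w = w - LR * v         # parameter update
    return w

if __name__ == "__main__":
    # 3,000 update steps, echoing the paper's 3,000 epochs; the iterate
    # approaches the minimizer w = 3 of the placeholder loss.
    print(sgd_momentum(0.0, 3000))
```

Note the velocity form `v <- m*v + g` matches PyTorch's SGD momentum convention, which folds the learning rate into the final `w - lr*v` step rather than into the velocity itself.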