Classification Under Strategic Self-Selection

Authors: Guy Horowitz, Yonatan Sommer, Moran Koren, Nir Rosenfeld

ICML 2024

Reproducibility assessment (each entry lists the variable, the result, and the supporting LLM response):
Research Type: Experimental
LLM Response: "We conclude with experiments on real data and simulated behavior that complement our analysis and demonstrate the utility of our approach." "We conclude with an empirical demonstration of our approach in a semi-synthetic experimental setting that uses real data and simulated self-selective behavior." "We now turn to our experimental analysis based on real data and simulated self-selective behavior."
Researcher Affiliation: Academia
LLM Response: "Faculty of Computer Science, Technion – Israel Institute of Technology; Department of Industrial Engineering and Management, Ben-Gurion University. Correspondence to: Nir Rosenfeld <nirr@cs.technion.ac.il>."
Pseudocode: No
LLM Response: The paper does not contain any explicit sections or figures labeled "Pseudocode" or "Algorithm," nor does it present structured steps in a code-like format.
Open Source Code: Yes
LLM Response: "Code is publicly available at https://github.com/Ysommer/GKSC-ICML."
Open Datasets: Yes
LLM Response: "We use two public datasets: (i) adult and (ii) bank, both of which are publicly available, commonly used for evaluation in the fairness literature (Le Quy et al., 2022), and appropriate for our setting." The adult data is available at https://archive.ics.uci.edu/dataset/2/adult and the bank data at https://www.kaggle.com/datasets/prakharrathi25/banking-dataset-marketing-targets.
Dataset Splits: Yes
LLM Response: "Data is split 70-30 into train and test sets. Overall we did not see evidence of overfitting, and hence had no need for a validation set."
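The reported 70-30 split could be reproduced along these lines. This is a minimal sketch, not the paper's code: the helper name, seed, and shuffling strategy are assumptions, since the released repository is not quoted here.

```python
import random

def train_test_split(rows, train_frac=0.7, seed=0):
    """Shuffle indices and split rows into train/test by fraction.

    Hypothetical helper mirroring the reported 70-30 split; the paper's
    actual splitting code may seed and shuffle differently.
    """
    rng = random.Random(seed)
    idx = list(range(len(rows)))
    rng.shuffle(idx)
    cut = round(len(rows) * train_frac)
    return [rows[i] for i in idx[:cut]], [rows[i] for i in idx[cut:]]

# 100 dummy rows -> 70 train, 30 test
train, test = train_test_split(list(range(100)))
```

Fixing the seed keeps the split deterministic across runs, which matters when comparing methods on the same train/test partition.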
Hardware Specification: Yes
LLM Response: "The main experiment was run on a CPU cluster of AMD EPYC 7713 machines (1.6 GHz, 256M, 128 cores)."
Software Dependencies: No
LLM Response: "All code was implemented in Python, and the learning framework was implemented using PyTorch." The paper names the software but provides no version numbers for Python, PyTorch, or any other libraries, which are needed for full reproducibility.
Experiment Setup: Yes
LLM Response: "We used vanilla gradient descent with learning rate 0.1 and trained for a predetermined and fixed number of epochs." Hyperparameters: temperature τ_app for the application sigmoid ξ in Eq. (15): 5; temperature τ_prec for the precision proxy g_prc in Eq. (17): 5; temperature τ for the standard sigmoid σ: 2; cost tolerance ε: 0.02 for adult, 0.05 for bank. Regularization coefficient λ_app for R_app in Eq. (18): 1/6 for strat and strat-ŷz on adult; 1/6 for strat and 1/64 for strat-ŷz on bank. Regularization coefficient λ for R in Eq. (19) (used only for strat-ŷz): for adult, 8 at the lowest cost c = 0.65, increasing linearly to 16 at c = 0.85; for bank, 100.15.
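As a rough illustration of these settings, a temperature-scaled sigmoid and a vanilla gradient-descent loop with learning rate 0.1 might look as follows. The 1-D model, loss, and data are illustrative, not the paper's, and the parameterization σ_τ(x) = σ(τx) is an assumption (the paper may scale by 1/τ instead).

```python
import math

def sigmoid(x, tau=1.0):
    """Temperature-scaled sigmoid. The paper reports tau_app = tau_prec = 5
    and tau = 2; sigma_tau(x) = sigma(tau * x) is an assumed convention."""
    return 1.0 / (1.0 + math.exp(-tau * x))

def train(xs, ys, lr=0.1, epochs=200, tau=2.0):
    """Vanilla gradient descent with the reported learning rate 0.1 and a
    fixed number of epochs, on an illustrative 1-D logistic model."""
    w = 0.0
    for _ in range(epochs):
        # Gradient of binary cross-entropy wrt w when p = sigmoid(tau * w * x):
        # dL/dw = (p - y) * tau * x, averaged over the batch.
        grad = sum((sigmoid(w * x, tau) - y) * tau * x
                   for x, y in zip(xs, ys)) / len(xs)
        w -= lr * grad
    return w

# Toy separable data: positive x -> label 1, negative x -> label 0.
w = train([1.0, -1.0, 2.0, -2.0], [1, 0, 1, 0])
```

A higher temperature sharpens the sigmoid toward a step function, which is presumably why the smooth proxies (τ = 5) use a larger temperature than the standard sigmoid (τ = 2).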