Fair Off-Policy Learning from Observational Data

Authors: Dennis Frauen, Valentyn Melnychuk, Stefan Feuerriegel

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We demonstrate the effectiveness of our framework through extensive numerical experiments using both simulated and real-world data.
Researcher Affiliation | Academia | LMU Munich; Munich Center for Machine Learning.
Pseudocode | Yes | Algorithm 1 provides the learning algorithm for FairPol.
Open Source Code | Yes | Code is available at https://github.com/DennisFrauen/FairPol.git.
Open Datasets | Yes | We use medical data from the Oregon health insurance experiment (Finkelstein et al., 2012). The Oregon health insurance experiment took place in 2008. ... The dataset is available here: https://www.nber.org/programs-projects/projects-and-centers/oregon-health-insurance-experiment
Dataset Splits | Yes | We first split the data into a training and validation set, and we then perform hyperparameter tuning using a grid search. All evaluations are based on the test set so that we capture the out-of-sample performance on unseen data. Additional details for our framework are in Appendix F. ... We split the data into a training set (80%) and a validation set (10%).
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments.
Software Dependencies | No | The paper mentions software like "Adam (Kingma & Ba, 2015)" and "TARNet (Shalit et al., 2017)" but does not specify version numbers for these or other software libraries/dependencies (e.g., Python, PyTorch, TensorFlow versions).
Experiment Setup | Yes | We followed best practices in causal machine learning (e.g., Bica et al., 2021; Curth & van der Schaar, 2021) and performed extensive hyperparameter tuning for FairPol. We split the data into a training set (80%) and a validation set (10%). We then performed 30 random grid search iterations and chose the set of parameters that minimized the respective training loss on the validation set. ... The tuning ranges for the hyperparameters are shown in Table 5 (simulated data) and Table 6 (real-world data). ... In our FairPol implementation, we use feed-forward neural networks with dropout and exponential linear unit activation functions for the base representation network, the outcome prediction network, and the sensitive attribute network. We use Adam (Kingma & Ba, 2015) for the optimization in both Steps 1 and 2. (See the training-setup sketch below the table.)
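
The Experiment Setup row describes a standard deep-learning pipeline: an 80%/10% train/validation split (with the remainder held out for testing), 30 random grid-search iterations scored by validation loss, and feed-forward networks with dropout and ELU activations trained with Adam. Below is a minimal sketch of that pipeline, assuming PyTorch; the layer sizes, tuning ranges, and plain MSE objective are illustrative placeholders, not the ranges from Tables 5 and 6 or the FairPol-specific objectives of Steps 1 and 2.

```python
# Minimal sketch of the described training setup, assuming PyTorch.
# Layer sizes, hyperparameter ranges, and the MSE loss are illustrative
# placeholders; they are NOT the paper's tuning ranges or FairPol losses.
import random

import torch
import torch.nn as nn
from torch.utils.data import TensorDataset, DataLoader, random_split


def make_mlp(in_dim: int, hidden_dim: int, out_dim: int, dropout: float) -> nn.Sequential:
    """Feed-forward network with dropout and ELU activations, as described in the paper."""
    return nn.Sequential(
        nn.Linear(in_dim, hidden_dim), nn.ELU(), nn.Dropout(dropout),
        nn.Linear(hidden_dim, hidden_dim), nn.ELU(), nn.Dropout(dropout),
        nn.Linear(hidden_dim, out_dim),
    )


def train_and_validate(train_set, val_set, hp: dict) -> float:
    """Train one hyperparameter configuration with Adam and return its validation loss."""
    model = make_mlp(in_dim=10, hidden_dim=hp["hidden_dim"], out_dim=1, dropout=hp["dropout"])
    optimizer = torch.optim.Adam(model.parameters(), lr=hp["lr"])
    loss_fn = nn.MSELoss()  # placeholder objective, not the FairPol Step 1/2 losses
    loader = DataLoader(train_set, batch_size=hp["batch_size"], shuffle=True)

    for _ in range(hp["epochs"]):
        for x, y in loader:
            optimizer.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()
            optimizer.step()

    model.eval()
    with torch.no_grad():
        x_val, y_val = next(iter(DataLoader(val_set, batch_size=len(val_set))))
        return loss_fn(model(x_val), y_val).item()


if __name__ == "__main__":
    # Toy data standing in for the (pre-processed) covariates and outcomes.
    x = torch.randn(1000, 10)
    y = torch.randn(1000, 1)
    data = TensorDataset(x, y)

    # 80% training / 10% validation split; the remaining 10% is held out as a test set.
    n = len(data)
    n_train, n_val = int(0.8 * n), int(0.1 * n)
    train_set, val_set, _test_set = random_split(data, [n_train, n_val, n - n_train - n_val])

    # Illustrative tuning ranges (assumption); the paper's ranges are in Tables 5 and 6.
    grid = {
        "hidden_dim": [16, 32, 64],
        "dropout": [0.0, 0.1, 0.3],
        "lr": [1e-3, 5e-4, 1e-4],
        "batch_size": [64, 128],
        "epochs": [30],
    }

    # 30 random grid-search iterations; keep the configuration with the lowest validation loss.
    best_hp, best_loss = None, float("inf")
    for _ in range(30):
        hp = {k: random.choice(v) for k, v in grid.items()}
        val_loss = train_and_validate(train_set, val_set, hp)
        if val_loss < best_loss:
            best_hp, best_loss = hp, val_loss
    print(f"best configuration: {best_hp} (validation loss {best_loss:.4f})")
```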