On the Adversarial Robustness of Causal Algorithmic Recourse

Authors: Ricardo Dominguez-Olmedo, Amir H Karimi, Bernhard Schölkopf

ICML 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We demonstrate their effectiveness on five tabular datasets, for linear and neural network classifiers." "We present the experimental results in Figure 3." "We present the experimental results in Figure 4." "We empirically evaluate whether training the decision-making classifier with the proposed ALLR regularizer facilitates the existence of adversarially robust recourse."
Researcher Affiliation | Academia | Max Planck Institute for Intelligent Systems, Tübingen, Germany; University of Tübingen, Germany; ETH Zürich, Switzerland.
Pseudocode | Yes | "Algorithm 1: Generate adversarially robust recourse for a differentiable classifier h and differentiable SCM M." (A hedged sketch of such a procedure appears after this table.)
Open Source Code | Yes | "We open source our implementations and experiments": github.com/RicardoDominguez/AdversariallyRobustRecourse
Open Datasets | Yes | "We consider four real-world datasets and one semi-synthetic dataset." For the causal recourse setting, the paper uses the COMPAS recidivism dataset (Larson et al., 2016) and the Adult demographic dataset (Kohavi & Becker, 1996), adopting the causal graphs assumed in Nabi & Shpitser (2018), plus one semi-synthetic SCM introduced by Karimi et al. (2020) inspired by a loan approval setting. For the non-causal recourse setting, it uses the South German Credit dataset (Groemping, 2019) and a recidivism dataset from North Carolina (Schmidt & Witte, 1988), referred to as Bail.
Dataset Splits | No | The paper reports an "80%-20% train-test split" and tunes the number of epochs for "best predictive performance", implying some internal validation, but the main text does not specify a separate validation split percentage or sample counts. (An illustrative split is sketched after this table.)
Hardware Specification | No | The paper does not report the hardware used to run the experiments, such as GPU/CPU models, memory, or cloud computing instance types.
Software Dependencies | No | The paper mentions "Adam (Kingma & Ba, 2015) as the optimizer" but does not specify version numbers for the software dependencies or libraries (e.g., Python, PyTorch, TensorFlow, scikit-learn) that would be needed for replication.
Experiment Setup | Yes | "We use Adam (Kingma & Ba, 2015) as the optimizer with a learning rate of 10^-3 and a batch size of 100. To determine a suitable number of training epochs for each dataset and training objective, we train for 500 epochs and select the number of training epochs which leads to the best predictive performance in terms of accuracy and Matthews correlation coefficient (MCC). For ALLR with NN classifiers, we heuristically find that µ1 = 3.0 works well across all datasets. We additionally perform hyperparameter search over µ2 ∈ {0.01, 0.1, 0.5, 3.0}." (See the training-loop sketch after this table.)
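Only the name of Algorithm 1 is quoted above, so the following is a minimal sketch of what gradient-based generation of adversarially robust recourse could look like, assuming a differentiable classifier h that returns a scalar logit and a differentiable counterfactual map scm_cf induced by the SCM M. The helper name scm_cf, the additive-intervention parameterization, the L1 action cost, and the trade-off weight lmbda are illustrative assumptions, not the authors' exact method.

```python
import torch

def robust_recourse(x, h, scm_cf, eps=0.1, outer_steps=300, inner_steps=10, lr=0.05, lmbda=2.0):
    """Illustrative sketch (not the paper's exact Algorithm 1): find a low-cost
    action a whose SCM counterfactual is positively classified even under the
    worst-case perturbation delta of the factual instance x, with |delta| <= eps."""
    a = torch.zeros_like(x, requires_grad=True)   # action, here an additive intervention
    opt = torch.optim.Adam([a], lr=lr)
    alpha = 2.5 * eps / inner_steps               # PGD step size for the inner adversary
    for _ in range(outer_steps):
        # Inner maximization: worst-case perturbation of the factual instance
        delta = torch.zeros_like(x, requires_grad=True)
        for _ in range(inner_steps):
            score = h(scm_cf(x + delta, a.detach()))
            grad, = torch.autograd.grad(score, delta)
            # Descend the classifier score, then project back onto the eps-ball
            delta = (delta - alpha * grad.sign()).clamp(-eps, eps).detach().requires_grad_(True)
        # Outer minimization: cheapest action accepted in the worst case
        worst_logit = h(scm_cf(x + delta.detach(), a))
        loss = a.abs().sum() + lmbda * torch.relu(-worst_logit)  # hinge at decision threshold 0
        opt.zero_grad()
        loss.backward()
        opt.step()
    return a.detach()
```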
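The reported 80%-20% train-test split can be reproduced along these lines; the random seed, stratification, and placeholder data are assumptions, since the paper does not specify them.

```python
import numpy as np
from sklearn.model_selection import train_test_split

X, y = np.random.rand(1000, 7), np.random.randint(0, 2, 1000)  # placeholder data, not a paper dataset
# 80%-20% train-test split as reported; random_state and stratify are assumptions
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0, stratify=y)
```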
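Finally, a minimal training loop consistent with the reported setup: Adam with learning rate 10^-3, batch size 100, up to 500 epochs, and epoch selection by accuracy and MCC. Summing the two selection criteria is an assumption; the paper does not state how they are combined.

```python
import torch
from sklearn.metrics import matthews_corrcoef

def train_classifier(model, train_loader, X_val, y_val, epochs=500, lr=1e-3):
    """Adam with lr 1e-3 and up to 500 epochs, as reported; train_loader is assumed
    to yield batches of 100. Keeps the epoch with the best accuracy + MCC
    (the additive combination is an assumption, not stated in the paper)."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = torch.nn.BCEWithLogitsLoss()
    best_score, best_state = -float("inf"), None
    for _ in range(epochs):
        for xb, yb in train_loader:
            opt.zero_grad()
            loss_fn(model(xb).squeeze(-1), yb.float()).backward()
            opt.step()
        with torch.no_grad():
            preds = (model(X_val).squeeze(-1) > 0).long()  # logit threshold at 0
        acc = (preds == y_val).float().mean().item()
        mcc = matthews_corrcoef(y_val.numpy(), preds.numpy())
        if acc + mcc > best_score:
            best_score = acc + mcc
            best_state = {k: v.clone() for k, v in model.state_dict().items()}
    model.load_state_dict(best_state)
    return model
```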