Counterfactual Metarules for Local and Global Recourse

Authors: Tom Bewley, Salim I. Amoukou, Saumitra Mishra, Daniele Magazzeni, Manuela Veloso

ICML 2024

Reproducibility assessment. Each entry below lists the variable, the result, and the LLM response supporting that result.
Research Type: Experimental
LLM Response: "We introduce T-CREx, a novel model-agnostic method for local and global counterfactual explanation (CE), which summarises recourse options for both individuals and groups in the form of human-readable rules. It leverages tree-based surrogate models to learn the counterfactual rules, alongside metarules denoting their regions of optimality, providing both a global analysis of model behaviour and diverse recourse options for users. Experiments indicate that T-CREx achieves superior aggregate performance over existing rule-based baselines on a range of CE desiderata, while being orders of magnitude faster to run."
Researcher Affiliation: Industry
LLM Response: "J.P. Morgan AI Research. Correspondence to: Tom Bewley <tom.bewley@jpmorgan.com>. Proceedings of the 41st International Conference on Machine Learning, Vienna, Austria. PMLR 235, 2024. Copyright 2024 by the author(s)."
Pseudocode: No
LLM Response: The T-CREx algorithm is described in seven steps (a-g) with accompanying explanations and a visual representation in Figure 2. However, these steps are presented as descriptive paragraphs rather than as a formal pseudocode block or algorithm listing.
Open Source Code: No
LLM Response: The paper mentions using existing public Python implementations for baseline methods (e.g., "We use a third-party implementation by Kanamori et al. (2022) at https://github.com/kelicht/cet"). However, it neither links to source code for the authors' own method, T-CREx, nor states that such code is publicly available.
Open Datasets: Yes
LLM Response: "Our experimental setup is somewhat inspired by that used in the RF-OCSE paper (Fernández et al., 2020), and we retain eight out of the 10 public-access datasets studied there. ... Details of the 10 datasets used in our experiments are as follows (# Inst. = number of instances, # Feat. = number of features, # Cat. = number of categorical features, Class Balance = proportion of instances with the positive class):"
  Abalone: Abalone (Nash et al., 1995)
  Adult: Adult / Census Income (Becker & Kohavi, 1996)
  Banknote: Banknote Authentication (Lohweg, 2013)
  COMPAS: COMPAS Recidivism Racial Bias (Pro Publica, 2016)
  Credit: Default of Credit Card Clients (Yeh, 2016)
  HELOC: Home Equity Line of Credit (FICO, 2018)
  Mamm. Mass: Mammographic Mass (Elter, 2007)
  Occupancy: Occupancy Detection (Candanedo, 2016)
  Pima: Pima Indians Diabetes (Smith et al., 1988)
  Wine Quality: Wine Quality (Cortez et al., 2009)
Dataset Splits: Yes
LLM Response: "We run this experiment on nine binary classification datasets (details in Appendix C), using 10-fold cross-validation (CV) to split the datasets into train and test components (D, D_test), and aggregate results across all folds."
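The 10-fold CV protocol quoted above can be sketched as follows. This is an illustrative reconstruction, not the authors' code; the helper name `ten_fold_splits` is hypothetical, and the paper does not specify whether folds are shuffled or stratified.

```python
# Hypothetical sketch of the 10-fold cross-validation protocol: each fold
# serves once as the held-out test set D_test, with the remaining nine
# folds forming the training set D.

def ten_fold_splits(n_instances, n_folds=10):
    """Yield (train_indices, test_indices) pairs, one per fold."""
    indices = list(range(n_instances))
    fold_size = n_instances // n_folds
    for k in range(n_folds):
        start = k * fold_size
        # The last fold absorbs any remainder when n_instances % n_folds != 0.
        end = (k + 1) * fold_size if k < n_folds - 1 else n_instances
        test = indices[start:end]
        train = indices[:start] + indices[end:]
        yield train, test

# Results would then be aggregated across all folds, e.g.:
# scores = [evaluate(fit(train), test) for train, test in ten_fold_splits(len(D))]
```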
Hardware Specification: Yes
LLM Response: "Finally, we report the runtime (on an r6i.large AWS instance) of all algorithmic variants and baselines."
Software Dependencies: No
LLM Response: The paper mentions using specific software components such as "XGBoost model (Chen & Guestrin, 2016)" and "sklearn.neural_network.MLPClassifier" for the target models. However, it does not provide version numbers for these or other software libraries and dependencies, which are necessary for full reproducibility.
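One common remedy for missing dependency versions is to log them at experiment start. A minimal sketch (the function name `log_environment` and the default module list are illustrative, not from the paper):

```python
import platform


def log_environment(modules=("xgboost", "sklearn")):
    """Return one 'name version' line per dependency, plus the Python version."""
    lines = ["python " + platform.python_version()]
    for name in modules:
        try:
            mod = __import__(name)
            # Most scientific-Python packages expose __version__.
            lines.append(name + " " + getattr(mod, "__version__", "unknown"))
        except ImportError:
            lines.append(name + " not installed")
    return lines

# print("\n".join(log_environment()))
```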
Experiment Setup: Yes
LLM Response: "Throughout this section, f is an XGBoost model (Chen & Guestrin, 2016), trained on D with n_estimators=50 and max_leaves=8. ... We begin by characterising the performance of T-CREx as a function of key hyperparameters, specifically the number of trees in the surrogate model (∈ {1, 2, 3, 5, 10, 20}) and the accuracy threshold τ (∈ {0.8, 0.9, 0.95, 0.98, 0.99}), while holding the feasibility threshold constant at ρ = 0.02."
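The quoted target-model configuration could be reproduced along these lines. A sketch only: it assumes the `xgboost` package with its scikit-learn-style `XGBClassifier` interface, and `X_train`, `y_train` are placeholders for the training data D; the exact package version is unspecified in the paper.

```python
# Hyperparameters quoted from the paper for the target model f.
params = {"n_estimators": 50, "max_leaves": 8}

# Assuming the xgboost package is installed (version unspecified):
# from xgboost import XGBClassifier
# f = XGBClassifier(**params).fit(X_train, y_train)
```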