DISCRET: Synthesizing Faithful Explanations For Treatment Effect Estimation

Authors: Yinjun Wu, Mayank Keoliya, Kan Chen, Neelay Velingker, Ziyang Li, Emily J Getzen, Qi Long, Mayur Naik, Ravi B Parikh, Eric Wong

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate DISCRET on diverse tasks involving tabular, image, and text data. DISCRET outperforms the best self-interpretable models and has accuracy comparable to the best black-box models while providing faithful explanations.
Researcher Affiliation | Academia | (1) School of Computer Science, Peking University, Beijing, China; (2) Department of Computer and Information Science, University of Pennsylvania, Philadelphia, PA, United States; (3) School of Public Health, Harvard University, Boston, MA, United States; (4) Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States.
Pseudocode | Yes | Algorithm 1: The overview of the Deep Q-Learning (DQL) algorithm for rule learning in DISCRET.
Open Source Code | Yes | DISCRET is available at https://github.com/wuyinjun-1993/DISCRET-ICML2024.
Open Datasets | Yes | Specifically, we select IHDP (Hill, 2011), TCGA (Weinstein et al., 2013), IHDP-C (a variant of IHDP), and News for the tabular setting; the Enriched Equity Evaluation Corpus (EEEC) dataset (Kiritchenko & Mohammad, 2018) for the text setting; and the Uganda dataset (Jerzak et al., 2023b;a) for the image setting.
Dataset Splits | No | No explicit mention of validation dataset splits (e.g., specific percentages or sample counts for a validation set) was found. The paper refers to "in-sample and out-of-sample ϵ_ATE" and evaluation on the "training set and test set" but does not detail a separate validation split.
Hardware Specification | No | No specific hardware details (e.g., GPU model, CPU type, memory) used for running the experiments were found. The paper makes general references to training models but does not specify the computational resources.
Software Dependencies | No | No specific software dependencies with version numbers (e.g., 'Python 3.8', 'PyTorch 1.9') were explicitly stated.
Experiment Setup | Yes | For both variants, we perform a grid search over the number of conjunctions K, the number of disjunctions H, and the regularization coefficient λ, where K ∈ {2, 4, 6}, H ∈ {1, 3}, and λ ∈ {0, 2, 5, 8, 10}.