Causal Proxy Models for Concept-based Model Explanations
Authors: Zhengxuan Wu, Karel D’Oosterlinck, Atticus Geiger, Amir Zur, Christopher Potts
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate these methods on the CEBaB benchmark for causal explanation methods (Abraham et al., 2022), which provides large numbers of original examples (restaurant reviews) with human-created counterfactuals for specific concepts (e.g., service quality), with all the texts labeled for their concept-level and text-level sentiment. This counterfactual data is used to uncover the true counterfactual behavior of a model, against which a causal explanation of the model can be benchmarked. |
| Researcher Affiliation | Academia | 1 Stanford University, Stanford, California; 2 Ghent University - imec, Ghent, Belgium. |
| Pseudocode | No | The paper describes methods using text and mathematical formulas but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code is available at https://github.com/frankaging/Causal-Proxy-Model. |
| Open Datasets | Yes | We evaluate these methods on the CEBaB benchmark for causal explanation methods (Abraham et al., 2022) |
| Dataset Splits | Yes | The groups of originals and corresponding approximate counterfactuals are partitioned over train/dev/test. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running experiments, such as GPU models, CPU types, or cloud instance specifications. |
| Software Dependencies | No | Our models are all implemented in PyTorch (Paszke et al., 2019) and using the Hugging Face library (Wolf et al., 2019). While software is mentioned, specific version numbers (e.g., PyTorch 1.9) are not provided. |
| Experiment Setup | Yes | CPM_IN: The maximum number of training epochs is set to 30 with a learning rate of 5e-5 and an effective batch size of 128. The learning rate linearly decays to 0 over the 30 training epochs. We employ an early stopping strategy for COSICaCE over the dev set for an interval of 50 steps with early stopping patience set to 20. We set the max sequence length to 128 and the dropout rate to 0.1. We take a weighted sum of two objectives as the loss term for training CPM_HI. Specifically, we use [w_Mimic, w_IN] = [1.0, 3.0]. For the smoothed cross-entropy loss, we use a temperature of 2.0. A hedged sketch of this loss and schedule setup appears after the table. |
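
To make the quoted training settings concrete, below is a minimal PyTorch sketch of a weighted two-objective loss with a temperature-smoothed cross-entropy (weights [1.0, 3.0], temperature 2.0) and a learning rate that decays linearly to 0. The function and variable names (`smoothed_ce`, `cpm_hi_loss`, the mimic/intervention logits), the choice of AdamW, and the stand-in model are assumptions for illustration only, not the authors' implementation; the linked repository contains the actual code.

```python
# Hypothetical sketch of the training setup quoted above; names and the
# optimizer choice are assumptions, not the authors' implementation.
import torch
import torch.nn.functional as F

W_MIMIC, W_IN = 1.0, 3.0   # quoted loss weights for the two objectives
TEMPERATURE = 2.0          # quoted temperature for the smoothed cross-entropy
LEARNING_RATE = 5e-5
EPOCHS = 30
BATCH_SIZE = 128

def smoothed_ce(student_logits: torch.Tensor, teacher_logits: torch.Tensor) -> torch.Tensor:
    """Cross-entropy against temperature-softened teacher probabilities."""
    teacher_probs = F.softmax(teacher_logits / TEMPERATURE, dim=-1)
    student_log_probs = F.log_softmax(student_logits / TEMPERATURE, dim=-1)
    return -(teacher_probs * student_log_probs).sum(dim=-1).mean()

def cpm_hi_loss(mimic_student, mimic_teacher, intervened_student, intervened_teacher):
    """Weighted sum of a mimic objective and an intervention objective."""
    loss_mimic = smoothed_ce(mimic_student, mimic_teacher)
    loss_in = smoothed_ce(intervened_student, intervened_teacher)
    return W_MIMIC * loss_mimic + W_IN * loss_in

# Optimizer and schedule matching the quoted settings: the learning rate decays
# linearly to 0 over 30 epochs. AdamW and steps_per_epoch are assumed here.
model = torch.nn.Linear(768, 5)   # stand-in for the proxy model
steps_per_epoch = 100             # assumed; depends on dataset size and batch size
optimizer = torch.optim.AdamW(model.parameters(), lr=LEARNING_RATE)
scheduler = torch.optim.lr_scheduler.LinearLR(
    optimizer, start_factor=1.0, end_factor=0.0,
    total_iters=EPOCHS * steps_per_epoch,
)
```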