BayCon: Model-agnostic Bayesian Counterfactual Generator
Authors: Piotr Romashov, Martin Gjoreski, Kacper Sokol, Maria Vanina Martinez, Marc Langheinrich
IJCAI 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the advantages of our method through a collection of experiments based on six real-life datasets representing three regression and three classification tasks. |
| Researcher Affiliation | Academia | Piotr Romashov¹*, Martin Gjoreski¹*, Kacper Sokol², Maria Vanina Martinez³, Marc Langheinrich¹. ¹Università della Svizzera italiana, Switzerland; ²RMIT University, Australia; ³Universidad de Buenos Aires, Argentina |
| Pseudocode | Yes | Algorithm 1: BayCon. |
| Open Source Code | Yes | The BayCon implementation and the experimentation code, including processed datasets and analysis of the results, are freely available on GitHub: https://github.com/piotromashov/baycon |
| Open Datasets | Yes | All the datasets are available online; the Bike dataset can be downloaded from the UCI repository and the other datasets are available through the OpenML repository [Vanschoren et al., 2014]. A hedged loading sketch appears after the table. |
| Dataset Splits | No | The models were trained with all the data, excluding the explained instances. Since SVMs can be sensitive to feature scaling and model parameterisation, we applied min-max normalisation to the input features and tuned the model parameters using 3-fold cross-validation on the training data. This describes preprocessing and tuning, but it does not specify a general train/validation/test split for the main experiments (see the preprocessing sketch after the table). |
| Hardware Specification | Yes | All the experiments were run on a 3.70GHz Intel Core i9 CPU with 128GB of RAM. |
| Software Dependencies | No | Our method is implemented in Python 3.6 and relies heavily on scikit-learn [Pedregosa et al., 2011]. The paper states the Python version but does not pin a specific scikit-learn version. |
| Experiment Setup | Yes | For each classification dataset we selected 10 random instances to be explained, generating their counterfactual explanations 3 times to account for randomness... For each regression dataset, we selected 3 initial instances... Next, we generated explanations for 4 desired target ranges... The maximum number of iterations was set to 100. In our experiments, the constant that controls the trade-off between global search and local optimisation (i.e., exploration/exploitation) is set to ξ = 0.01... We used k = 100 for our experiments, which provides the minimum difference of 1% relative to the attribute range. (A hedged acquisition-function sketch using ξ follows the table.) |
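The Open Datasets row states that all but the Bike dataset come from OpenML. A minimal loading sketch is below; the dataset name and version are hypothetical stand-ins, since the paper does not list the exact OpenML identifiers it used.

```python
# Sketch of fetching an evaluation dataset from OpenML via scikit-learn.
# "diabetes"/version 1 is an illustrative choice, not the paper's dataset.
from sklearn.datasets import fetch_openml

dataset = fetch_openml(name="diabetes", version=1, as_frame=True)
X, y = dataset.data, dataset.target
print(X.shape, y.shape)
```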
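The Dataset Splits row describes min-max normalisation plus SVM tuning with 3-fold cross-validation. The following is a minimal sketch of that preprocessing pipeline, assuming an SVM classifier; the stand-in dataset and parameter grid are illustrative, as the paper does not list its grid.

```python
from sklearn.datasets import load_breast_cancer  # stand-in dataset
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)

# Min-max normalisation of the input features, followed by SVM
# parameter tuning via 3-fold cross-validation, as quoted above.
pipeline = Pipeline([
    ("scaler", MinMaxScaler()),
    ("svm", SVC()),
])
param_grid = {  # illustrative grid, not the paper's
    "svm__C": [0.1, 1, 10],
    "svm__gamma": ["scale", 0.01, 0.1],
}
search = GridSearchCV(pipeline, param_grid, cv=3)
search.fit(X, y)
print(search.best_params_)
```

Scaling inside the pipeline ensures the normalisation statistics are fitted only on each cross-validation training fold, avoiding leakage into the validation folds.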
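In the Experiment Setup row, ξ = 0.01 is described as the exploration/exploitation constant. A ξ of this form conventionally appears in the expected-improvement acquisition function of Bayesian optimisation; the sketch below is a generic version under that assumption, not BayCon's own implementation (the paper's Algorithm 1 gives the exact procedure).

```python
import numpy as np
from scipy.stats import norm

def expected_improvement(mu, sigma, f_best, xi=0.01):
    """Expected improvement for a maximisation problem.

    mu, sigma : surrogate posterior mean and std. dev. at the candidates.
    f_best    : best objective value observed so far.
    xi        : exploration/exploitation constant (0.01 in the quote above).
    """
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    imp = mu - f_best - xi  # margin over the incumbent, discounted by xi
    with np.errstate(divide="ignore", invalid="ignore"):
        z = imp / sigma
        ei = imp * norm.cdf(z) + sigma * norm.pdf(z)
    ei[sigma == 0.0] = 0.0  # no posterior uncertainty => no expected gain
    return ei

# Larger xi favours exploration; xi = 0.01 leans towards exploitation.
print(expected_improvement([0.5, 0.7], [0.1, 0.2], f_best=0.6, xi=0.01))
```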