BayCon: Model-agnostic Bayesian Counterfactual Generator
Authors: Piotr Romashov, Martin Gjoreski, Kacper Sokol, Maria Vanina Martinez, Marc Langheinrich
IJCAI 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the advantages of our method through a collection of experiments based on six real-life datasets representing three regression and three classification tasks. |
| Researcher Affiliation | Academia | Piotr Romashov¹*, Martin Gjoreski¹*, Kacper Sokol², Maria Vanina Martinez³, Marc Langheinrich¹. ¹Università della Svizzera italiana, Switzerland; ²RMIT University, Australia; ³Universidad de Buenos Aires, Argentina |
| Pseudocode | Yes | Algorithm 1: BayCon. |
| Open Source Code | Yes | The BayCon implementation and the experimentation code, including processed datasets and analysis of the results, are freely available on GitHub: https://github.com/piotromashov/baycon |
| Open Datasets | Yes | All the datasets are available online; the Bike dataset can be downloaded from the UCI repository and the other datasets are available through the OpenML repository [Vanschoren et al., 2014]. A hedged loading sketch appears after the table. |
| Dataset Splits | No | The models were trained with all the data, excluding the explained instances. Since SVMs can be sensitive to feature scaling and model parameterisation, we applied min-max normalisation to the input features and tuned the model parameters using 3-fold cross-validation on the training data. This describes preprocessing and tuning, but it does not specify a general train/validation/test split for the main experiments (see the preprocessing sketch after the table). |
| Hardware Specification | Yes | All the experiments were run on a 3.70GHz Intel Core i9 CPU with 128GB of RAM. |
| Software Dependencies | No | Our method is implemented in Python 3.6 and relies heavily on scikit-learn [Pedregosa et al., 2011]. The paper states the Python version but does not pin a specific scikit-learn version. |
| Experiment Setup | Yes | For each classification dataset we selected 10 random instances to be explained, generating their counterfactual explanations 3 times to account for randomness... For each regression dataset, we selected 3 initial instances... Next, we generated explanations for 4 desired target ranges... The maximum number of iterations was set to 100. In our experiments, the constant that controls the trade-off between global search and local optimisation (i.e., exploration/exploitation) is set to ξ = 0.01... We used k = 100 for our experiments, which provides the minimum difference of 1% relative to the attribute range. (A hedged acquisition-function sketch using ξ follows the table.) |
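The Open Datasets row states that all but the Bike dataset come from OpenML. A minimal loading sketch is below; the dataset name and version are hypothetical stand-ins, since the paper does not list the exact OpenML identifiers it used.

```python
# Sketch of fetching an evaluation dataset from OpenML via scikit-learn.
# "diabetes"/version 1 is an illustrative choice, not the paper's dataset.
from sklearn.datasets import fetch_openml

dataset = fetch_openml(name="diabetes", version=1, as_frame=True)
X, y = dataset.data, dataset.target
print(X.shape, y.shape)
```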
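The Dataset Splits row describes min-max normalisation plus SVM tuning with 3-fold cross-validation. The following is a minimal sketch of that preprocessing pipeline, assuming an SVM classifier; the stand-in dataset and parameter grid are illustrative, as the paper does not list its grid.

```python
from sklearn.datasets import load_breast_cancer  # stand-in dataset
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)

# Min-max normalisation of the input features, followed by SVM
# parameter tuning via 3-fold cross-validation, as quoted above.
pipeline = Pipeline([
    ("scaler", MinMaxScaler()),
    ("svm", SVC()),
])
param_grid = {  # illustrative grid, not the paper's
    "svm__C": [0.1, 1, 10],
    "svm__gamma": ["scale", 0.01, 0.1],
}
search = GridSearchCV(pipeline, param_grid, cv=3)
search.fit(X, y)
print(search.best_params_)
```

Scaling inside the pipeline ensures the normalisation statistics are fitted only on each cross-validation training fold, avoiding leakage into the validation folds.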
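In the Experiment Setup row, ξ = 0.01 is described as the exploration/exploitation constant. A ξ of this form conventionally appears in the expected-improvement acquisition function of Bayesian optimisation; the sketch below is a generic version under that assumption, not BayCon's own implementation (the paper's Algorithm 1 gives the exact procedure).

```python
import numpy as np
from scipy.stats import norm

def expected_improvement(mu, sigma, f_best, xi=0.01):
    """Expected improvement for a maximisation problem.

    mu, sigma : surrogate posterior mean and std. dev. at the candidates.
    f_best    : best objective value observed so far.
    xi        : exploration/exploitation constant (0.01 in the quote above).
    """
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    imp = mu - f_best - xi  # margin over the incumbent, discounted by xi
    with np.errstate(divide="ignore", invalid="ignore"):
        z = imp / sigma
        ei = imp * norm.cdf(z) + sigma * norm.pdf(z)
    ei[sigma == 0.0] = 0.0  # no posterior uncertainty => no expected gain
    return ei

# Larger xi favours exploration; xi = 0.01 leans towards exploitation.
print(expected_improvement([0.5, 0.7], [0.1, 0.2], f_best=0.6, xi=0.01))
```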