Measuring axiomatic soundness of counterfactual image models
Authors: Miguel Monteiro, Fabio De Sousa Ribeiro, Nick Pawlowski, Daniel C. Castro, Ben Glocker
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We now demonstrate the utility of our evaluation framework by applying it to three datasets. |
| Researcher Affiliation | Collaboration | 1Imperial College London, 2Microsoft Research Cambridge. |
| Pseudocode | Yes | D.2.1 PSEUDO-ORACLES. Architecture. The architecture of the pseudo-oracles is: pseudo_oracle = serial( Conv(out_chan=64, filter_shape=(4, 4), strides=(2, 2)), LeakyRelu, Conv(out_chan=64, filter_shape=(4, 4), strides=(2, 2)), LeakyRelu, Flatten, Dense(out_dim=128), LeakyRelu, Dense(out_dim=num_classes if classification else 1) ); y_hat = pseudo_oracle(image). (A runnable sketch of this architecture appears after the table.) |
| Open Source Code | No | The paper does not contain any explicit statement about releasing its own source code or a link to a repository for the methods described. |
| Open Datasets | Yes | We apply it to three datasets. For demonstration purposes, we assume invertible mechanisms so we can use the reversibility metric. 4.1 COLOUR MNIST ... using the MNIST dataset (LeCun et al., 1998) ... 4.2 3D SHAPES ... using the 3D shapes dataset (Burgess & Kim, 2018) ... 4.3 CELEBA-HQ ... the CelebA-HQ dataset (Karras et al., 2018) |
| Dataset Splits | Yes | We keep 10% of images as a test set and train on the remaining 90%. (for 3D Shapes) ... we randomly split the 30,000 examples into 70% for training, 15% for validation and 15% for testing. (for CelebA-HQ; a split sketch follows the table) |
| Hardware Specification | No | The paper mentions 'compute constraints' but does not specify any particular GPU, CPU, or other hardware model numbers used for running experiments. |
| Software Dependencies | No | The paper mentions 'JAX framework' and 'PyTorch (Paszke et al., 2019)' but does not provide specific version numbers for these software dependencies. |
| Experiment Setup | Yes | Training details. We trained the pseudo-oracles for 2000 steps with a batch size of 1024 using the AdamW (Loshchilov & Hutter, 2019) optimiser with a learning rate of 0.0005, β1 = 0.9, β2 = 0.999 and weight_decay = 0.0001. (A training-loop sketch follows the table.) |
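
The pseudo-oracle pseudocode quoted above maps directly onto `jax.example_libraries.stax`. Below is a minimal runnable sketch under that assumption; the paper reports using the JAX framework, but the specific stax module, the 28×28×3 colour-MNIST input shape, and the `num_classes` value are illustrative assumptions, not details confirmed by the paper.

```python
import jax
import jax.numpy as jnp
from jax.example_libraries import stax

num_classes = 10       # hypothetical: depends on the parent variable being predicted
classification = True  # regression pseudo-oracles output a single scalar instead

# Convolutional pseudo-oracle as quoted in Appendix D.2.1 of the paper.
init_fn, pseudo_oracle = stax.serial(
    stax.Conv(out_chan=64, filter_shape=(4, 4), strides=(2, 2)),
    stax.LeakyRelu,
    stax.Conv(out_chan=64, filter_shape=(4, 4), strides=(2, 2)),
    stax.LeakyRelu,
    stax.Flatten,
    stax.Dense(out_dim=128),
    stax.LeakyRelu,
    stax.Dense(out_dim=num_classes if classification else 1),
)

# Initialise parameters for NHWC inputs (shape chosen for colour MNIST).
rng = jax.random.PRNGKey(0)
out_shape, params = init_fn(rng, (-1, 28, 28, 3))

# Forward pass: y_hat = pseudo_oracle(image)
images = jnp.zeros((8, 28, 28, 3))
y_hat = pseudo_oracle(params, images)
```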
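
For the CelebA-HQ split quoted in the table, a minimal sketch of a 70/15/15 random split over 30,000 examples; the fixed seed and the use of NumPy are assumptions for illustration, not details from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)          # hypothetical seed
indices = rng.permutation(30_000)
n_train, n_val = 21_000, 4_500          # 70% and 15% of 30,000
train_idx = indices[:n_train]
val_idx = indices[n_train:n_train + n_val]
test_idx = indices[n_train + n_val:]    # remaining 15% (4,500 examples)
```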
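
Finally, a minimal sketch of the reported training setup, continuing from the architecture sketch above (`pseudo_oracle`, `params`). The paper specifies AdamW with the listed hyperparameters; the choice of `optax`, the cross-entropy loss, and the placeholder batch are assumptions.

```python
import jax
import jax.numpy as jnp
import optax

# AdamW with the hyperparameters reported in the paper.
optimiser = optax.adamw(learning_rate=0.0005, b1=0.9, b2=0.999,
                        weight_decay=0.0001)
opt_state = optimiser.init(params)

def loss_fn(params, images, labels):
    # Cross-entropy for the classification pseudo-oracles (assumed loss).
    logits = pseudo_oracle(params, images)
    return optax.softmax_cross_entropy_with_integer_labels(logits, labels).mean()

@jax.jit
def train_step(params, opt_state, images, labels):
    loss, grads = jax.value_and_grad(loss_fn)(params, images, labels)
    updates, opt_state = optimiser.update(grads, opt_state, params)
    return optax.apply_updates(params, updates), opt_state, loss

# Placeholder batch of size 1024; substitute the real data pipeline here.
images = jnp.zeros((1024, 28, 28, 3))
labels = jnp.zeros((1024,), dtype=jnp.int32)

for step in range(2000):  # 2000 steps, as reported
    params, opt_state, loss = train_step(params, opt_state, images, labels)
```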