Counterfactual Generative Networks
Authors: Axel Sauer, Andreas Geiger
ICLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the ability of our model to generate such images on MNIST and ImageNet. Further, we show that the counterfactual images can improve out-of-distribution robustness with a marginal drop in performance on the original classification task, despite being synthetic. Lastly, our generative model can be trained efficiently on a single GPU, exploiting common pre-trained models as inductive biases. |
| Researcher Affiliation | Academia | Axel Sauer1,2 & Andreas Geiger1,2, Autonomous Vision Group, 1Max Planck Institute for Intelligent Systems, Tübingen, 2University of Tübingen, {firstname.lastname}@tue.mpg.de |
| Pseudocode | No | The paper describes its methods verbally and through architectural diagrams (e.g., Figure 2) but does not include explicit pseudocode or algorithm blocks. |
| Open Source Code | Yes | We release our code at https://github.com/autonomousvision/counterfactual_generative_networks |
| Open Datasets | Yes | We demonstrate the ability of our model to generate such images on MNIST and ImageNet. |
| Dataset Splits | No | The paper mentions training and testing, and uses metrics like Inception Score during training, but does not explicitly detail a separate 'validation' dataset split with percentages or counts for reproducibility. |
| Hardware Specification | Yes | whole CGN on a single NVIDIA GTX 1080Ti within 12 hours |
| Software Dependencies | No | We use a ResNet-50 from PyTorch torchvision. We use the pre-trained BigGAN models from https://github.com/huggingface/pytorch-pretrained-BigGAN. We use Adam (Kingma & Ba, 2014). (A minimal loading sketch follows the table.) |
| Experiment Setup | Yes | We use the following lambdas: λ1 = 100, λ2 = 5, λ3 = 300, λ4 = 500, λ5 = 5, λ6 = 2000. For the optimization we use Adam (Kingma & Ba, 2014), and set the learning rate of f_shape to 8e-6, and for both f_text and f_bg to 1e-5. We train for 70 episodes with Stochastic Gradient Descent using a batch size of 512. Of the 512 images, 256 are real images, 256 are counterfactual images. We use a momentum of 0.9, weight decay (1e-4), and a learning rate of 0.1, multiplied by a factor of 0.001 after 30 and 60 epochs. (A configuration sketch follows the table.) |
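The pre-trained components named under Software Dependencies can be loaded roughly as in the sketch below. The specific BigGAN checkpoint (`biggan-deep-256`) and the illustrative class index are assumptions on our part; the paper only names torchvision's ResNet-50 and the huggingface pytorch-pretrained-BigGAN repository.

```python
# Minimal loading sketch, assuming the dependencies quoted above.
import torch
from torchvision import models
from pytorch_pretrained_biggan import BigGAN, truncated_noise_sample

# ResNet-50 from PyTorch torchvision, as cited in the paper.
resnet50 = models.resnet50(pretrained=True)

# Pre-trained BigGAN from https://github.com/huggingface/pytorch-pretrained-BigGAN.
# The 256x256 "deep" checkpoint is an assumption; the paper does not quote a variant here.
biggan = BigGAN.from_pretrained('biggan-deep-256')

# Sample a small class-conditional batch, following that repository's public API.
noise = torch.from_numpy(truncated_noise_sample(truncation=0.4, batch_size=2))
class_vec = torch.zeros(2, 1000)
class_vec[:, 207] = 1.0  # arbitrary ImageNet class chosen only for illustration
with torch.no_grad():
    imgs = biggan(noise, class_vec, truncation=0.4)
```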
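The hyperparameters quoted under Experiment Setup translate into optimizer configuration along the following lines. This is a sketch only: `f_shape`, `f_text`, `f_bg`, and `classifier` are hypothetical stand-in modules for the CGN's three mechanisms and the downstream classifier, and the mapping of each λ to a specific loss term is not reproduced here.

```python
# Configuration sketch, assuming placeholder modules for the CGN mechanisms.
import torch
from torch import nn

# Hypothetical stand-ins; the real mechanisms are BigGAN-based and the classifier is a ResNet-50.
f_shape, f_text, f_bg = nn.Linear(8, 8), nn.Linear(8, 8), nn.Linear(8, 8)
classifier = nn.Linear(8, 8)

# Loss weights as quoted in the paper's experiment setup.
lambdas = {'l1': 100, 'l2': 5, 'l3': 300, 'l4': 500, 'l5': 5, 'l6': 2000}

# Adam with the quoted per-mechanism learning rates.
cgn_opt = torch.optim.Adam([
    {'params': f_shape.parameters(), 'lr': 8e-6},
    {'params': f_text.parameters(), 'lr': 1e-5},
    {'params': f_bg.parameters(), 'lr': 1e-5},
])

# Classifier training: SGD, batch size 512 (256 real + 256 counterfactual),
# momentum 0.9, weight decay 1e-4, lr 0.1, multiplied by 0.001 after epochs 30 and 60.
cls_opt = torch.optim.SGD(classifier.parameters(), lr=0.1,
                          momentum=0.9, weight_decay=1e-4)
scheduler = torch.optim.lr_scheduler.MultiStepLR(cls_opt, milestones=[30, 60], gamma=0.001)
```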