Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Counterfactuals uncover the modular structure of deep generative models
Authors: Michel Besserve, Arash Mehrjou, Rémy Sun, Bernhard Schölkopf
ICLR 2020 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 4 EXPERIMENTS. We first investigated modularity of generative models trained on the CelebFaces Attributes Dataset (CelebA) (Liu et al., 2015) and used a basic architecture: a plain β-VAE (Higgins et al., 2017). We ran the full procedure described in Sec. 3, comprised of EIM calculations, clustering of channels into modules, and hybridization of generator samples using these modules. |
| Researcher Affiliation | Academia | Michel Besserve1,2, Arash Mehrjou1,3, Rémy Sun1,3, Bernhard Schölkopf1. 1. MPI for Intelligent Systems, Tübingen, Germany. 2. MPI for Biological Cybernetics, Tübingen, Germany. 3. Department of Computer Science, ETH Zürich, Switzerland. 4. ENS Rennes, France. |
| Pseudocode | No | The paper describes methods textually (e.g., 'the hybridization procedure... goes as follows'), but does not include any clearly labeled 'Pseudocode' or 'Algorithm' blocks or figures. |
| Open Source Code | Yes | Implementations are available on the companion website https://gitlab.tuebingen.mpg.de/besserve/deepcounterfactuals. |
| Open Datasets | Yes | CelebFaces Attributes Dataset (CelebA) (Liu et al., 2015) and ImageNet dataset (http://www.image-net.org/). |
| Dataset Splits | No | The paper mentions using specific datasets such as CelebA and ImageNet, and either training models or using pre-trained ones, but it does not explicitly describe how the datasets were split into training, validation, and test sets for its experiments. |
| Hardware Specification | No | The paper does not provide specific details on the hardware used for running experiments or training models. |
| Software Dependencies | No | The paper mentions software like 'tensorlayer DCGAN implementation' and 'Tensorflow-hub' for pre-trained models, but it does not specify concrete version numbers for these or other software dependencies. |
| Experiment Setup | Yes | Hyperparameters for both structures are specified in Table 1. Optimization algorithm: Adam (β = 0.5); minimized objective: VAE loss (Gaussian posteriors); batch size: 64; beta parameter: 0.0005. |
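
The hyperparameters quoted in the Experiment Setup row can be read as a small configuration plus a β-weighted VAE objective. The following is a minimal sketch of that reading, not the authors' implementation: the names `ExperimentSetup`, `gaussian_kl`, and `beta_vae_loss` are hypothetical, the squared-error reconstruction term is an assumption, and the row does not say which Adam coefficient "β = 0.5" refers to, so it is stored without interpretation.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ExperimentSetup:
    """Values quoted in the Experiment Setup row; field names are illustrative."""
    optimizer: str = "adam"
    adam_beta: float = 0.5    # quoted only as "β = 0.5"; which Adam coefficient is not stated
    batch_size: int = 64
    vae_beta: float = 0.0005  # "Beta parameter": weight on the KL term


def gaussian_kl(mu, logvar):
    """KL divergence from N(mu, diag(exp(logvar))) to N(0, I) for one sample."""
    return 0.5 * sum(m * m + math.exp(lv) - lv - 1.0 for m, lv in zip(mu, logvar))


def beta_vae_loss(x, x_recon, mu, logvar, beta):
    """Squared-error reconstruction (an assumption) plus the beta-weighted KL term."""
    recon = sum((a - b) ** 2 for a, b in zip(x, x_recon))
    return recon + beta * gaussian_kl(mu, logvar)


setup = ExperimentSetup()
# Toy two-pixel "image" with a two-dimensional latent code.
loss = beta_vae_loss(
    x=[0.0, 1.0],
    x_recon=[0.0, 0.5],
    mu=[0.0, 0.0],
    logvar=[0.0, 0.0],
    beta=setup.vae_beta,
)
```

With a beta parameter as small as 0.0005, the KL term contributes very little relative to the reconstruction term, so this objective behaves close to a plain autoencoder loss with light regularization of the latent code.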