Mapping the Multiverse of Latent Representations

Authors: Jeremy Wayland, Corinna Coupette, Bastian Rieck

ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | To address our guiding questions, we generate multiverses for two types of generative models, i.e., variational autoencoders as generators of images, and transformers as generators of natural language. We are particularly interested in the impact of algorithmic choices A, implementation choices I, and data choices D on the generated representations. Further experiments (including on dimensionality reduction), a multiverse analysis of the choices involved in the PRESTO pipeline, and more details on all results reported here can be found in Appendix B.
Researcher Affiliation | Academia | (1) Helmholtz Munich, (2) Technical University of Munich, (3) KTH Royal Institute of Technology, (4) Max Planck Institute for Informatics.
Pseudocode | No | No section or block explicitly labeled "Pseudocode" or "Algorithm" was found; the PRESTO pipeline steps are described in prose rather than in a structured, code-like format.
Open Source Code | Yes | Reproducibility Statement: We make all code, data, and results publicly available. Reproducibility materials are available at https://doi.org/10.5281/zenodo.11355446, and our code is maintained at https://github.com/aidos-lab/Presto.
Open Datasets | Yes | We train on five datasets: (1) CelebA, (2) CIFAR-10, (3) dSprites, (4) Fashion-MNIST, and (5) MNIST.
Dataset Splits | Yes | Each model was trained using a random [0.6, 0.3, 0.1] train/validation/test split for each of our five datasets (a minimal split sketch follows the table).
Hardware Specification | No | No specific hardware details (e.g., exact CPU/GPU models or memory sizes) are provided; the paper only mentions that experiments were run on a "single CPU" in its empirical analysis of running times.
Software Dependencies | No | No version numbers for software dependencies are provided; the paper mentions using the "sentence-transformers" library and "PYTORCH autoencoder frameworks", but without specifying versions.
Experiment Setup | Yes | Each model was trained using an ADAM optimizer with a learning rate of 0.001 over 30 epochs (a minimal training sketch follows the table).
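
The split reported in the Dataset Splits row can be reproduced in a few lines of PyTorch. The following is a minimal sketch, not the authors' code: MNIST stands in for any of the five datasets, the torchvision loader and the fixed seed are our assumptions, and the only detail taken from the paper is the random 0.6/0.3/0.1 train/validation/test split.

```python
import torch
from torch.utils.data import random_split
from torchvision import datasets, transforms

# MNIST as a stand-in for any of the five datasets (loader choice is an assumption).
full = datasets.MNIST(root="data", download=True, transform=transforms.ToTensor())

# Random 0.6 / 0.3 / 0.1 train/validation/test split, as reported in the paper.
n = len(full)
n_train, n_val = int(0.6 * n), int(0.3 * n)
n_test = n - n_train - n_val

# The fixed seed is our assumption; the paper does not report one here.
generator = torch.Generator().manual_seed(0)
train_set, val_set, test_set = random_split(full, [n_train, n_val, n_test], generator=generator)
```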
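
Likewise, the configuration in the Experiment Setup row (ADAM optimizer, learning rate 0.001, 30 epochs) translates into a short training loop. The sketch below reuses `train_set` from the split sketch above; the model, loss, and batch size are placeholders of our own choosing, since the paper trains variational autoencoders whose architectures and objectives are not reproduced here.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader

# Placeholder model; the paper's VAE architectures are not reproduced here.
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 64), nn.ReLU(), nn.Linear(64, 28 * 28))

optimizer = torch.optim.Adam(model.parameters(), lr=0.001)  # ADAM, learning rate 0.001 (as reported)
loss_fn = nn.MSELoss()  # stand-in reconstruction loss; the paper's objective is not specified here
loader = DataLoader(train_set, batch_size=128, shuffle=True)  # batch size is our assumption

for epoch in range(30):  # 30 epochs, as reported
    for x, _ in loader:
        optimizer.zero_grad()
        loss = loss_fn(model(x), x.flatten(start_dim=1))
        loss.backward()
        optimizer.step()
```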