From Causal to Concept-Based Representation Learning

Authors: Goutham Rajendran, Simon Buchholz, Bryon Aragam, Bernhard Schölkopf, Pradeep Ravikumar

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on synthetic data, multimodal CLIP models, and large language models supplement our results and show the utility of our approach.
Researcher Affiliation | Academia | 1 Machine Learning Dept., Carnegie Mellon University, Pittsburgh, USA; 2 Max Planck Institute for Intelligent Systems, Tübingen, Germany; 3 University of Chicago, Chicago, USA; 4 ELLIS Institute, Tübingen, Germany
Pseudocode | Yes | Algorithm 1: Rejection sampling for controllable generative modeling
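The paper's Algorithm 1 itself is not reproduced in this report; as a hedged illustration only, generic rejection sampling for steering a generative model toward a target concept value can be sketched as follows (the generator, the concept extractor, and all names here are hypothetical stand-ins, not the paper's models):

```python
import random

def concept_value(x):
    # Hypothetical stand-in for a learned concept extractor:
    # here, simply the mean of the latent vector.
    return sum(x) / len(x)

def generate():
    # Hypothetical stand-in for a pretrained generative model:
    # draws a 4-dimensional latent sample.
    return [random.gauss(0.0, 1.0) for _ in range(4)]

def rejection_sample(target, tol=0.1, max_tries=10000):
    """Draw samples, accepting the first whose concept value
    lies within tol of the target; reject all others."""
    for _ in range(max_tries):
        x = generate()
        if abs(concept_value(x) - target) <= tol:
            return x
    raise RuntimeError("no accepted sample within budget")

random.seed(0)
x = rejection_sample(target=0.0, tol=0.2)
```

Any accepted sample satisfies the concept constraint by construction, which is the core of the controllability argument.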
Open Source Code | Yes | The code, along with instructions on how to run it, is attached in the supplementary material.
Open Datasets | Yes | We embed images from the 3D Shapes dataset [16] with known factors of variation into the latent space of two different pretrained CLIP models.
Dataset Splits | No | We split the embedded images into training and test sets of equal size. The paper mentions only training and test sets and does not provide an explicit validation split or its percentages/counts.
Hardware Specification | Yes | The preprocessing to calculate the CLIP image embeddings required a few hours on an A100 GPU... We train for 100 epochs, on a single A6000 GPU... The experiments are performed on eight A6000 GPUs.
Software Dependencies | No | We use the open-source large language model LLaMA [119] with 7 billion parameters (the open-sourced version from Hugging Face) and the sentence transformer SBERT [97] for the sentence embedding. The paper names these software components but does not provide specific version numbers for them.
Experiment Setup | Yes | For the contrastive algorithm, we choose the architecture to be either linear or a nonlinear 2-layer MLP with 32 hidden neurons in each layer... We train for 100 epochs... with η = 0.0001 and use the Adam optimizer with learning rates 0.5 for the parametric layer and 0.005 for the non-parametric layer, with a cosine annealing schedule [72].
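As a minimal sketch of the cosine annealing schedule cited above, applied to the two quoted learning rates over the quoted 100 epochs (a floor of lr_min = 0 and one step per epoch are assumptions here; the paper's exact schedule parameters are not quoted):

```python
import math

def cosine_annealed_lr(lr_max, lr_min, t, t_max):
    """Cosine annealing: decay the learning rate from lr_max
    down to lr_min over t_max steps along a half cosine wave."""
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + math.cos(math.pi * t / t_max))

epochs = 100
# 0.5 for the parametric layer, 0.005 for the non-parametric layer,
# as quoted above; lr_min = 0 is an assumption.
parametric = [cosine_annealed_lr(0.5, 0.0, t, epochs) for t in range(epochs + 1)]
nonparametric = [cosine_annealed_lr(0.005, 0.0, t, epochs) for t in range(epochs + 1)]
```

The schedule starts at the quoted rate, passes through half the rate at the midpoint, and decays smoothly to the floor by the final epoch.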