Composing Ensembles of Pre-trained Models via Iterative Consensus
Authors: Shuang Li, Yilun Du, Joshua B. Tenenbaum, Antonio Torralba, Igor Mordatch
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate the proposed framework for composing pre-trained models on four representative tasks, including image generation, video question answering, grade school math, and robot manipulation. We compare the proposed method with baselines on the above four zero-shot tasks. |
| Researcher Affiliation | Collaboration | Shuang Li (MIT CSAIL, lishuang@mit.edu); Yilun Du (MIT CSAIL, yilundu@mit.edu); Joshua B. Tenenbaum (MIT CSAIL, BCS, CBMM, jbt@mit.edu); Antonio Torralba (MIT CSAIL, torralba@mit.edu); Igor Mordatch (Google Brain, imordatch@google.com) |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks (i.e., sections or figures explicitly labeled 'Pseudocode' or 'Algorithm'). |
| Open Source Code | No | The paper does not provide a direct statement or link for open-source code of the described methodology. It only links to the third-party pre-trained models it uses. |
| Open Datasets | Yes | We evaluate the image generation results on ImageNet (Deng et al., 2009)... We evaluate methods for solving VQA tasks on ActivityNet-QA (Yu et al., 2019). GSM8K (Cobbe et al., 2021) is a dataset for grade school math problems... We next evaluate how pre-trained models may be used to manipulate objects in Ravens (Zeng et al., 2020). |
| Dataset Splits | No | The paper names the datasets used for evaluation and a test set for GSM8K, but it does not specify explicit train/validation/test splits (e.g., percentages or counts) for all datasets needed to reproduce the data partitioning, nor does it cite predefined splits for all of them. |
| Hardware Specification | Yes | We use TITAN RTX 24GB GPUs for all the experiments. |
| Software Dependencies | No | The paper mentions using the 'Huggingface library (Wolf et al., 2019)' for CLIP models and provides URLs for specific CLIP model checkpoints. However, it does not provide specific version numbers for the Huggingface library itself or other key software components, which is required for a reproducible description of ancillary software. |
| Experiment Setup | Yes | The guidance scale is set to 3. (for image generation) ... In our experiments, we use 5 steps of gradient descent. The learning rate α is set to 0.3. (for VQA and grade school math) |
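The reported setup (5 gradient-descent steps with learning rate α = 0.3) can be illustrated with a minimal sketch of an iterative refinement loop. This is a hypothetical stand-in, not the paper's implementation: the `refine` function and the toy quadratic scorer are assumptions for illustration, whereas the paper optimizes candidate solutions against pre-trained scorer models.

```python
def refine(candidate, score_grad, steps=5, alpha=0.3):
    """Run `steps` gradient-descent updates on `candidate`,
    using the hyperparameters quoted in the table above."""
    for _ in range(steps):
        candidate = candidate - alpha * score_grad(candidate)
    return candidate

# Toy scorer: gradient of 0.5 * (x - 2)^2, which is minimized at x = 2.
result = refine(10.0, lambda x: x - 2.0)
```

Each step shrinks the distance to the optimum by a factor of (1 - α) = 0.7, so five steps move the candidate most of the way toward the scorer's minimum.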