CLUTR: Curriculum Learning via Unsupervised Task Representation Learning
Authors: Abdus Salam Azad, Izzeddin Gur, Jasper Emhoff, Nathaniel Alexis, Aleksandra Faust, Pieter Abbeel, Ion Stoica
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experimental results show CLUTR outperforms PAIRED, a principled and popular UED method, in the challenging Car Racing and navigation environments: achieving 10.6X and 45% improvement in zero-shot generalization, respectively. CLUTR also performs comparably to the non-UED state-of-the-art for Car Racing, while requiring 500X fewer environment interactions. |
| Researcher Affiliation | Collaboration | 1University of California, Berkeley 2Google Research. |
| Pseudocode | Yes | Algorithm 1 CLUTR |
| Open Source Code | Yes | We open source our code at https://github.com/clutr/clutr. |
| Open Datasets | No | To train our VAEs, we generate random tasks by uniformly sampling from ΘT , the set of possible tasks. Thus, we do not require any interaction with the environment to learn the task manifold. ... For Car Racing, ... We train the VAE on 1 million randomly generated tracks... For navigation tasks ... we generated 10 million random grids... The paper describes how the training data was generated but does not provide a link or specific citation for publicly accessing these generated datasets. |
| Dataset Splits | No | The paper mentions testing on specific benchmarks but does not specify how the data was split into training, validation, and test sets with exact percentages, sample counts, or a detailed splitting methodology. |
| Hardware Specification | Yes | We used a single NVIDIA T4 GPU for our experiments, with machines having 8 (16) and 16 (32) physical (virtual) cores and 64 GB and 128 GB memory for the Car Racing and Minigrid experiments, respectively. |
| Software Dependencies | No | The paper mentions using PPO (Schulman et al., 2017) and Adam for training, but does not provide specific version numbers for these libraries or other software dependencies (e.g., Python, PyTorch, TensorFlow versions) that would be needed for reproducibility. |
| Experiment Setup | Yes | All our agents are trained with PPO (Schulman et al., 2017). We did not perform any hyperparameter search for our experiments. The Car Racing experiments used the same parameters used in Jiang et al. (2021a) and the Minigrid experiments used the parameters from Dennis et al. (2020). The VAE used for Car Racing and Minigrid standard objective experiments (Section E.2) were trained using the default parameters from Bowman et al. (2015). The detailed parameters are listed in Table 2 and Table 3. |
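The Open Datasets row notes that CLUTR's VAE training data is produced by uniformly sampling random tasks from ΘT rather than downloading a fixed dataset (e.g., 10 million random grids for navigation). A minimal sketch of that kind of generator is below; the grid size, wall-density range, and flattened encoding are illustrative assumptions, not the paper's exact task parameterization:

```python
import numpy as np

def sample_random_grid(size=15, wall_prob_max=0.5, rng=None):
    """Uniformly sample a navigation-style grid task.

    Illustrative only: the size, wall density, and 0/1 encoding are
    assumptions, not the paper's exact definition of Theta_T.
    """
    rng = rng if rng is not None else np.random.default_rng()
    # Sample a wall density, then place walls independently per cell.
    p = rng.uniform(0.0, wall_prob_max)
    grid = (rng.random((size, size)) < p).astype(np.int8)
    # Place the start and goal on two distinct free cells.
    free = np.argwhere(grid == 0)
    start, goal = free[rng.choice(len(free), size=2, replace=False)]
    return grid, tuple(start), tuple(goal)

# Build a (small) dataset of flattened grids, as one might for VAE training.
dataset = np.stack([
    sample_random_grid(rng=np.random.default_rng(i))[0].ravel()
    for i in range(1000)
])
print(dataset.shape)  # (1000, 225)
```

Because the generator only samples task parameters, no environment interaction is needed to build the dataset, which is the property the paper highlights.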