reproducibilityindex.ai

Towards Principled Representation Learning from Videos for Reinforcement Learning

Authors: Dipendra Misra, Akanksha Saran, Tengyang Xie, Alex Lamb, John Langford

ICLR 2024

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We empirically test our theoretical results in three visual domains, yielding results that are consistent with our theoretical findings.
Researcher Affiliation	Industry	Dipendra Misra (1), Akanksha Saran (2), Tengyang Xie (1), Alex Lamb (1), John Langford (1); (1) Microsoft Research, NY; (2) Sony Research, CA
Pseudocode	No	The paper describes methods textually and mathematically in sections like "3 REPRESENTATION LEARNING FOR RL USING VIDEO DATASET" and "B PROOFS OF THEORETICAL STATEMENTS", but it does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code	Yes	The code for all experiments is available as part of the Intrepid codebase at https://github.com/microsoft/Intrepid.
Open Datasets	Yes	We empirically test our theoretical results in three visual domains: Grid World (a navigation domain), ViZDoom basic (a first-person 3D shooting game), and ViZDoom Defend The Center (a more challenging first-person 3D shooting game).
Dataset Splits	No	The paper discusses training, validation, and testing phases for machine learning models and experiments in general. However, it does not specify how the datasets used in its experiments (Grid World, ViZDoom) were split into training, validation, and test sets (e.g., percentages, absolute counts, or citations to predefined splits).
Hardware Specification	Yes	All the code for this work was run on A100, V100, P40 GPUs, with a compute time of approx. 12 hours for grid world experiments and 6 hours for ViZDoom experiments.
Software Dependencies	No	The paper mentions software components like 'PPO' (Proximal Policy Optimization) and uses environments such as 'Minigrid' and 'ViZDoom'. However, it does not provide specific version numbers for these or any other ancillary software dependencies (e.g., Python, PyTorch, TensorFlow, CUDA versions) that would be necessary for replication.
Experiment Setup	Yes	In Table 2, we report the hyperparameter values used for experiments in this work with the Grid World and ViZDoom environments.