Bridging Environments and Language with Rendering Functions and Vision-Language Models

Authors: Theo Cachet, Christopher R. Dance, Olivier Sigaud

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate the proposed method on the Humanoid environment, showing that it results in LCAs that outperform MTRL baselines in zero-shot generalization, without requiring any textual task descriptions or other forms of environment-specific annotation during training.
Researcher Affiliation | Collaboration | 1. NAVER LABS Europe, Meylan; 2. Institute of Intelligent Systems and Robotics, Sorbonne University, Paris.
Pseudocode | Yes | Algorithm 1: Gradient-based configuration finetuning (an illustrative sketch of such a procedure follows the table).
Open Source Code | No | The paper provides a link to an interactive demo and videos (https://europe.naverlabs.com/text2control), but it does not explicitly state that the source code for the methodology is open-source or available via a repository link.
Open Datasets | Yes | We evaluate our approach on the Humanoid environment from OpenAI's Gym framework (Brockman et al., 2016)... Large-scale internet-scraped text and image data is a key enabler of current LLMs and text-to-image models (Schuhmann et al., 2022; Gadre et al., 2023; Penedo et al., 2023).
Dataset Splits | No | The paper mentions training and test sets but does not explicitly specify a validation split or describe how data was partitioned for validation in its own experiments.
Hardware Specification | Yes | using an NVIDIA RTX A6000 GPU and a 40-core Intel Xeon w7-2475X
Software Dependencies | Yes | We use the Humanoid environment from OpenAI's Gym framework (Brockman et al., 2016)... Rendering is performed using MuJoCo rendering functions... MuJoCo > 2.0.3
Experiment Setup | Yes | Table 6. Hyperparameters used when training the STRL, MTRL and GCRL agents with PPO:

Hyperparameter | Value
Clipping | 0.2
Discount factor, γ | 0.999
GAE parameter, λ | 0.95
Update time-step | 204 800 (MTRL and GCRL), 25 600 (STRL)
Batch size | 102 400 (MTRL and GCRL), 12 800 (STRL)
Epochs | 10
Learning rate | 5e-4
Learning-rate schedule | linear annealing
Gradient norm clipping | 0.5
Value clipping | no
Entropy coefficient | 2.5e-2
Value coefficient | 0.5
Activation function | GELU (Hendrycks & Gimpel, 2016)
Optimizer | AdamW (Loshchilov & Hutter, 2017)
Weight decay | 0.01
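The Pseudocode row refers to the paper's Algorithm 1, "Gradient-based configuration finetuning", whose details this report does not transcribe. The sketch below only illustrates what such a procedure could look like: gradient ascent on a VLM similarity score with respect to the environment configuration. It assumes a differentiable rendering function `render` and a differentiable scoring function `vlm_score`; these names, signatures, and the optimizer settings are hypothetical, not taken from the paper.

```python
import torch

def finetune_configuration(config, render, vlm_score, text_embedding,
                           steps=100, lr=1e-2):
    """Hypothetical sketch: finetune an environment configuration by
    gradient ascent on a VLM score of its rendering against a text goal.

    Assumes `render(c)` maps a configuration tensor to an image tensor
    differentiably, and `vlm_score(image, text_embedding)` returns a
    differentiable scalar similarity (e.g., a CLIP-style cosine score).
    """
    c = config.clone().detach().requires_grad_(True)
    optimizer = torch.optim.Adam([c], lr=lr)
    for _ in range(steps):
        image = render(c)                         # assumed differentiable
        score = vlm_score(image, text_embedding)  # scalar to maximize
        loss = -score                             # ascend by descending -score
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    return c.detach()
```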
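The Table 6 hyperparameters map directly onto a PPO training configuration. Below is a minimal sketch collecting them in one place and wiring up the AdamW optimizer with linear learning-rate annealing; the `PPOConfig` name, the placeholder `policy` network, and `num_updates` are assumptions for illustration, not the authors' code.

```python
from dataclasses import dataclass

import torch

@dataclass
class PPOConfig:
    # Values transcribed from Table 6 (MTRL/GCRL; STRL variants in comments).
    clip_range: float = 0.2
    gamma: float = 0.999             # discount factor
    gae_lambda: float = 0.95         # GAE parameter
    update_timesteps: int = 204_800  # 25 600 for STRL
    batch_size: int = 102_400        # 12 800 for STRL
    epochs: int = 10
    learning_rate: float = 5e-4      # linearly annealed (scheduler below)
    max_grad_norm: float = 0.5       # gradient norm clipping
    clip_value_loss: bool = False    # "Value clipping: no"
    entropy_coef: float = 2.5e-2
    value_coef: float = 0.5
    weight_decay: float = 0.01       # AdamW weight decay

cfg = PPOConfig()
policy = torch.nn.Linear(376, 17)  # placeholder network for illustration
optimizer = torch.optim.AdamW(policy.parameters(),
                              lr=cfg.learning_rate,
                              weight_decay=cfg.weight_decay)
num_updates = 1_000  # assumed total number of PPO updates
scheduler = torch.optim.lr_scheduler.LinearLR(
    optimizer, start_factor=1.0, end_factor=0.0, total_iters=num_updates)
```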