Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Modeling Unseen Environments with Language-guided Composable Causal Components in Reinforcement Learning
Authors: Xinyue Wang, Biwei Huang
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on numerical simulations and real-world robotic manipulation tasks demonstrate that WM3C significantly outperforms existing methods in identifying latent processes, improving policy learning, and generalizing to unseen tasks. |
| Researcher Affiliation | Academia | Xinyue Wang¹, Biwei Huang¹ ¹University of California San Diego EMAIL |
| Pseudocode | No | The paper includes mathematical formulations and descriptions of its algorithm, but does not contain any explicitly labeled pseudocode or algorithm blocks with structured steps. |
| Open Source Code | No | Project website: https://www.charonwangg.com/project/wm3c |
| Open Datasets | Yes | We conduct experiments in the robotic simulation environment, Meta-world (Yu et al., 2019). |
| Dataset Splits | Yes | We split all possible combinations into a training set for the i.i.d component identification test and a test set for the o.o.d component prediction test. ... We take 18 training tasks that follow this structure and 9 test tasks that either are new combinations of known language components or mixtures of known and unknown language components. |
| Hardware Specification | Yes | All experiments are conducted on 4 Nvidia 3090 GPUs. |
| Software Dependencies | No | We utilize the state-of-the-art model-based reinforcement learning algorithm, DreamerV3 (Hafner et al., 2023)... For WM3C CNN, we build upon the JAX implementation of DreamerV3 small... For DreamerV3, we take the official JAX implementation of DreamerV3 (Hafner et al., 2023) from https://github.com/danijar/dreamerv3... For visual-based multi-task SAC (MT-SAC), we take the visual-based SAC implementation from https://github.com/KarlXing/RL-Visual-Continuous-Control... The paper mentions specific software frameworks and implementations but does not provide version numbers for key dependencies like JAX or other libraries. |
| Experiment Setup | Yes | Table 2: Hyperparameters for WM3C CNN implementation in Meta-world. ... Table 3: MAE and ViT hyperparameters for WM3C MAE in Meta-world. |
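
The Dataset Splits row describes holding out task combinations so that test tasks recombine language components already seen in training. The following is a minimal Python sketch of that idea only, not the authors' code: the function `compositional_split`, the verb and object names, and the held-out pairs are all hypothetical, and the sketch covers only the "new combinations of known components" case, not test tasks that mix in unknown components.

```python
from itertools import product


def compositional_split(verbs, objects, held_out):
    """Split (verb, object) task combinations into train and o.o.d. test sets.

    All names here are illustrative assumptions, not taken from the paper.
    """
    all_tasks = list(product(verbs, objects))
    held = set(held_out)

    # Training tasks: every combination that is not explicitly held out.
    train = [task for task in all_tasks if task not in held]
    # Test tasks: the held-out combinations, in enumeration order.
    test = [task for task in all_tasks if task in held]

    # Each test component must already appear in some training task,
    # otherwise the test task is not a recombination of known components.
    seen_verbs = {verb for verb, _ in train}
    seen_objects = {obj for _, obj in train}
    for verb, obj in test:
        if verb not in seen_verbs or obj not in seen_objects:
            raise ValueError(f"unseen component in test task: {(verb, obj)}")

    return train, test


# Hypothetical components: 3 verbs x 3 objects = 9 combinations.
train_tasks, test_tasks = compositional_split(
    verbs=["open", "close", "push"],
    objects=["door", "drawer", "window"],
    held_out=[("open", "window"), ("push", "door")],
)
print(len(train_tasks), len(test_tasks))
```

The check inside the function is the part that distinguishes a compositional split from a random one: a held-out pair is accepted only if both of its components occur somewhere in the training set.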