Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Learning to Control Self-Assembling Morphologies: A Study of Generalization via Modularity
Authors: Deepak Pathak, Christopher Lu, Trevor Darrell, Phillip Isola, Alexei A. Efros
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate the performance of these dynamic and modular agents in simulated environments. We demonstrate better generalization to test-time changes both in the environment, as well as in the structure of the agent, compared to static and monolithic baselines. |
| Researcher Affiliation | Academia | Deepak Pathak (UC Berkeley), Chris Lu (UC Berkeley), Trevor Darrell (UC Berkeley), Phillip Isola (MIT), Alexei A. Efros (UC Berkeley) |
| Pseudocode | Yes | DGN pseudo-code (as well as source code) and all training implementation details are in Sections 1.1 and 1.4 of the supplementary. |
| Open Source Code | Yes | Project video and code are available at https://pathak22.github.io/modular-assemblies/. |
| Open Datasets | No | The paper states that the authors created their own environments because existing benchmarks did not support their research needs. No specific public dataset is used or provided with access information for training. |
| Dataset Splits | No | The paper does not explicitly provide specific training/test/validation dataset splits (percentages or counts) or refer to predefined validation splits with citations. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory) used for running its experiments. It only mentions the Unity ML framework. |
| Software Dependencies | No | The paper mentions "Unity ML" and "Mujoco gym environments" but does not provide specific version numbers for these or any other software dependencies. |
| Experiment Setup | Yes | Across all the tasks, the number of limbs at training is kept fixed to 6. At test, we report the mean reward across 50 episodes of 1200 environment steps. The reward function for locomotion is defined as the distance covered by the agent along the X-axis. Limbs start each episode disconnected and located just above the ground plane at random locations. |
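
The evaluation protocol quoted in the Experiment Setup row (mean reward across 50 episodes of 1200 environment steps, with locomotion reward equal to the distance covered along the X-axis) can be sketched as follows. This is a minimal illustration, not the authors' code: `run_episode`, `toy_episode`, and the random-walk dynamics are hypothetical placeholders, and treating "distance covered" as signed displacement is an assumption.

```python
import random
from statistics import mean

NUM_EPISODES = 50          # test episodes, as stated in the setup row
STEPS_PER_EPISODE = 1200   # environment steps per episode, as stated


def locomotion_reward(x_start: float, x_end: float) -> float:
    """Distance covered along the X-axis (assumed: signed displacement)."""
    return x_end - x_start


def evaluate(run_episode, num_episodes=NUM_EPISODES, steps=STEPS_PER_EPISODE):
    """Return the mean locomotion reward across `num_episodes` episodes.

    `run_episode(steps)` is a hypothetical callable that rolls out one
    episode and returns the agent's (x_start, x_end) positions.
    """
    rewards = [locomotion_reward(*run_episode(steps)) for _ in range(num_episodes)]
    return mean(rewards)


_rng = random.Random(0)  # fixed seed so the toy rollout is repeatable


def toy_episode(steps: int):
    """Placeholder rollout: a random walk along X, not the paper's simulator."""
    x_start = _rng.uniform(-1.0, 1.0)  # stands in for a random start location
    x = x_start
    for _ in range(steps):
        x += _rng.gauss(0.01, 0.05)
    return x_start, x


if __name__ == "__main__":
    print(f"mean reward over {NUM_EPISODES} episodes: {evaluate(toy_episode):.3f}")
```

In a real reproduction, `toy_episode` would be replaced by a rollout of the trained policy in the released environment; only the episode count, step count, and reward definition are taken from the row above.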