AnyMorph: Learning Transferable Polices By Inferring Agent Morphology
Authors: Brandon Trabucco, Mariano Phielipp, Glen Berseth
ICML 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our approach on the standard benchmark for agent-agnostic control, and improve over the current state of the art in zero-shot generalization to new agents. Importantly, our method attains good performance without an explicit description of morphology. |
| Researcher Affiliation | Collaboration | Brandon Trabucco (1), Mariano Phielipp (*, 2), Glen Berseth (*, 3). (1) Machine Learning Department, Carnegie Mellon University, work done while at Intel AI; (2) Intel AI; (3) Mila. |
| Pseudocode | No | No explicit pseudocode or algorithm block is present in the paper. The methodology is described in prose and through diagrams. |
| Open Source Code | Yes | Additionally, we have released the source code for our method and summarized how the model works at the following site. |
| Open Datasets | Yes | To answer these questions, we leverage a benchmark for agent-agnostic reinforcement learning developed by Huang et al. (2020, p. 1). This benchmark contains a set of eight reinforcement learning tasks... The agents present in this benchmark are inspired by and derived from standard OpenAI Gym tasks: HalfCheetah-v2, Walker2d-v2, Hopper-v2, and Humanoid-v2 (Brockman et al., 2016). |
| Dataset Splits | Yes | To answer this question, we follow Kurin et al. (2021) and hold out 3 Cheetahs, 2 Walkers, and 2 Humanoids respectively. See Appendix C for which specific morphologies are used for testing. |
| Hardware Specification | Yes | Our model fits on a single Nvidia 2080ti GPU, and requires seven days of training to reach 3 million environment steps. |
| Software Dependencies | No | The paper mentions using TD3, MuJoCo-like agents, and OpenAI Gym tasks, but does not specify software dependencies like programming language or library versions (e.g., Python version, PyTorch version) needed for reproducible setup. |
| Experiment Setup | Yes | We provide a table of hyperparameters in Appendix A for our policy and reinforcement learning optimizer. |
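
The Hardware Specification row quotes a compute budget of seven days on a single GPU for 3 million environment steps. As a rough aid for anyone planning a reproduction, the sketch below turns those two figures into an average throughput and a GPU-hour count. It assumes the run is continuous and uses the full seven days; the function and constant names are illustrative and do not come from the paper or its released code.

```python
# Figures quoted in the Hardware Specification row of the table above.
TOTAL_ENV_STEPS = 3_000_000
TRAINING_DAYS = 7

SECONDS_PER_DAY = 24 * 60 * 60


def average_steps_per_second(total_steps: int, days: float) -> float:
    """Mean environment steps per wall-clock second over the whole run.

    Assumption: training runs without interruption for `days` days.
    """
    return total_steps / (days * SECONDS_PER_DAY)


rate = average_steps_per_second(TOTAL_ENV_STEPS, TRAINING_DAYS)
gpu_hours = TRAINING_DAYS * 24  # single GPU, so GPU-hours equal wall-clock hours

print(f"{rate:.2f} env steps/s over {gpu_hours} GPU-hours")
# prints "4.96 env steps/s over 168 GPU-hours"
```

The resulting rate is an end-to-end average that includes policy updates, so it should not be read as the raw speed of the simulator.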