Distilling Morphology-Conditioned Hypernetworks for Efficient Universal Morphology Control
Authors: Zheng Xiong, Risto Vuorio, Jacob Beck, Matthieu Zimmer, Kun Shao, Shimon Whiteson
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments aim to answer the following questions: (1) Can HyperDistill achieve good performance on both training and unseen test robots? How does it compare to other methods w.r.t. performance and efficiency at inference time? (Section 4.2) (2) How do different algorithmic and architecture choices in HyperDistill influence its training and generalization performance? (Section 4.3) |
| Researcher Affiliation | Collaboration | ¹Department of Computer Science, University of Oxford, Oxford, UK; ²Huawei Noah's Ark Lab, London, UK. |
| Pseudocode | No | The paper does not contain any explicitly labeled pseudocode or algorithm blocks. It describes the methods in prose and with diagrams. |
| Open Source Code | Yes | The code is publicly available at https://github.com/MasterXiong/Universal-Morphology-Control. |
| Open Datasets | Yes | We experiment on the UNIMAL benchmark (Gupta et al., 2021) built upon the Mujoco simulator (Todorov et al., 2012), which includes 100 training robots and 100 test robots with diverse morphologies |
| Dataset Splits | No | The paper mentions 100 training robots and 100 test robots, and an augmented set of 1000 PD robots for distillation. It does not explicitly define a separate validation split or subset for hyperparameter tuning or model selection in the main text. |
| Hardware Specification | No | The paper mentions that "The experiments were made possible by a generous equipment grant from NVIDIA." However, it does not provide specific details such as GPU models (e.g., A100, V100), CPU models, or memory specifications. |
| Software Dependencies | No | The paper mentions using the "Mujoco simulator" and "Adam" optimizer, but it does not specify version numbers for these or any other software components (e.g., Python, PyTorch, TensorFlow, etc.) used in the experiments. |
| Experiment Setup | Yes | The distillation process runs for 150 epochs, with a mini-batch size of 5120. We use Adam with a learning rate of 0.0003, and clip the gradient norm to 0.5. For HyperDistill, we apply dropout to the context embedding with p = 0.1. (See the illustrative sketch after this table.) |
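
The reported setup amounts to a standard supervised distillation loop. Below is a minimal sketch of that optimization configuration, assuming a PyTorch implementation. The placeholder student network, the observation/action dimensions, and the synthetic (observation, teacher action) data are illustrative assumptions; the paper's actual student is a morphology-conditioned hypernetwork (HyperDistill), which is not reproduced here. Only the hyperparameters (150 epochs, mini-batch 5120, Adam with lr 0.0003, gradient norm clip 0.5, dropout p = 0.1 on the context embedding) come from the paper.

```python
# Hedged sketch of the distillation optimization settings reported above.
# The student below is a placeholder MLP, not the paper's hypernetwork.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

OBS_DIM, ACT_DIM = 64, 8          # assumed dimensions, not from the paper
EPOCHS, BATCH_SIZE = 150, 5120    # values reported in the paper
LR, GRAD_CLIP, DROPOUT_P = 3e-4, 0.5, 0.1

# Placeholder student: a context embedding with dropout (p = 0.1), then an action head.
student = nn.Sequential(
    nn.Linear(OBS_DIM, 256),
    nn.ReLU(),
    nn.Dropout(p=DROPOUT_P),      # dropout applied to the context embedding
    nn.Linear(256, ACT_DIM),
)

# Synthetic stand-in for (observation, teacher action) pairs collected from RL teachers.
observations = torch.randn(100_000, OBS_DIM)
teacher_actions = torch.randn(100_000, ACT_DIM)
loader = DataLoader(TensorDataset(observations, teacher_actions),
                    batch_size=BATCH_SIZE, shuffle=True)

optimizer = torch.optim.Adam(student.parameters(), lr=LR)
loss_fn = nn.MSELoss()            # simple behaviour-cloning loss, for illustration only

for epoch in range(EPOCHS):
    for obs, target in loader:
        loss = loss_fn(student(obs), target)
        optimizer.zero_grad()
        loss.backward()
        # Clip the global gradient norm to 0.5, as reported in the setup.
        nn.utils.clip_grad_norm_(student.parameters(), max_norm=GRAD_CLIP)
        optimizer.step()
```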