Subequivariant Graph Reinforcement Learning in 3D Environments
Authors: Runfa Chen, Jiaqi Han, Fuchun Sun, Wenbing Huang
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, we evaluate the proposed method on the proposed benchmarks, where our method consistently and significantly outperforms existing approaches on single-task, multi-task, and zero-shot generalization scenarios. Extensive ablations are also conducted to verify our design. |
| Researcher Affiliation | Collaboration | (1) Dept. of Comp. Sci. & Tech., Institute for AI, BNRist Center, Tsinghua University; (2) THU-Bosch JCML Center; (3) Gaoling School of Artificial Intelligence, Renmin University of China; (4) Beijing Key Laboratory of Big Data Management and Analysis Methods. |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | The paper states "Code and videos are available on our project page: https://alpc91.github.io/SGRL/." and "Our codes are available on https://github.com/alpc91/SGRL." |
| Open Datasets | Yes | The environments in our 3D-SGRL are modified from the default 2D-planar setups in MuJoCo (Todorov et al., 2012). Specifically, we extend agents in environments including Hopper, Walker, Humanoid and Cheetah (Huang et al., 2020) into 3D counterparts. |
| Dataset Splits | No | We keep 20% of the variants as the zero-shot testing set and use the rest for training. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments. |
| Software Dependencies | No | The paper mentions PyTorch and MuJoCo as software used, but does not provide specific version numbers for these or any other key software dependencies. |
| Experiment Setup | Yes | Table 6 provides the hyperparameters needed to replicate the experiments, listing values such as "Learning rate 0.0001", "Mini-batch size 100", and "Total attention layers 3"; a minimal configuration sketch based on these values follows the table. |
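
The three hyperparameters quoted from Table 6 can be grouped into a small configuration object for a replication attempt. The sketch below is illustrative only, assuming PyTorch (which the paper mentions): the `SGRLConfig` name, the field names, and the use of Adam are assumptions and not the authors' stated implementation; only the three numeric values come from the quoted Table 6 entries.

```python
# Illustrative sketch only: names and optimizer choice are assumptions;
# the three numeric values are quoted from Table 6 of the paper.
from dataclasses import dataclass

import torch


@dataclass
class SGRLConfig:                  # hypothetical container for the quoted settings
    learning_rate: float = 1e-4    # "Learning rate 0.0001" (Table 6)
    mini_batch_size: int = 100     # "Mini-batch size 100" (Table 6)
    attention_layers: int = 3      # "Total attention layers 3" (Table 6)


def make_optimizer(policy: torch.nn.Module, cfg: SGRLConfig) -> torch.optim.Optimizer:
    """Attach an optimizer using the quoted learning rate.

    Adam is an assumption here; Table 6 as quoted above specifies the
    learning rate but not the optimizer.
    """
    return torch.optim.Adam(policy.parameters(), lr=cfg.learning_rate)
```

Any setting beyond these three values (e.g., discount factor, replay-buffer size, network widths) is not listed in the row above and would need to be taken from the authors' released code at https://github.com/alpc91/SGRL.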