Subequivariant Reinforcement Learning in 3D Multi-Entity Physical Environments
Authors: Runfa Chen, Ling Wang, Yu Du, Tianrui Xue, Fuchun Sun, Jianwei Zhang, Wenbing Huang
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "5. Experiments", "5.1. Experimental Setup", "5.2. Evaluations in Diverse Environments", "5.3. Extended Evaluations on Transformer", "5.4. Ablation Studies". Figure 4. Training and Evaluation Curves in Team Reach Environments. Table 2. Evaluations on Basic Architectures. |
| Researcher Affiliation | Collaboration | 1 Dept. of Comp. Sci. & Tech., Institute for AI, BNRist Center, Tsinghua University; 2 Dept. of Info. Eng., Xi'an Research Institute of High-Tech; 3 School of Elec. Eng., Naval University of Engineering; 4 College of Arts & Sci., New York University; 5 THU-Bosch JCML Center; 6 TAMS, Dept. of Informatics, University of Hamburg; 7 Gaoling School of Artificial Intelligence, Renmin University of China; 8 Beijing Key Laboratory of Big Data Management and Analysis Methods. |
| Pseudocode | Yes | Algorithm 1 Greedy Bipartite Matching for Task Assignment |
| Open Source Code | Yes | Codes are available on our project page: https://alpc91.github.io/SMERL/. |
| Open Datasets | Yes | we propose the Multi-entity Benchmark (MEBEN), a new suite of environments tailored for exploring a wide range of multi-entity reinforcement learning. Built upon JAX-based RL environments (Bradbury et al., 2018; Godwin* et al., 2020; Heek et al., 2023; Freeman et al., 2021; Gu et al., 2021), MEBEN is designed to investigate multi-entity interactions... Code and Environments are available on our project page: https://alpc91.github.io/SMERL/. |
| Dataset Splits | No | The paper conducts evaluations ('Training and Evaluation Curves') and ablations, and mentions 'MAPPO involves the development of both the joint policy πθ and a value function Vϕ, crucial for variance reduction and integrating information beyond the agents' local observations', but it does not specify a distinct validation split, with sizes or percentages, for hyperparameter tuning. |
| Hardware Specification | No | "MEBEN harnesses the advanced capabilities of the Brax physics simulator (Freeman et al., 2021) and Composer (Gu et al., 2021)... This advancement propels MEBEN into facilitating efficient and scalable hardware-accelerated iterations on GPUs or TPUs, making it exceptionally suitable for morphology-based reinforcement learning experiments." This mentions general hardware types (GPUs/TPUs) but no specific models or configurations. |
| Software Dependencies | No | "SHNN is developed based on the MxT-Bench (Furuta et al., 2023) codebase, leveraging JAX (Bradbury et al., 2018) and Brax (Freeman et al., 2021; Gu et al., 2021) for efficient, hardware-accelerated simulations." While software libraries are mentioned, no version numbers are specified. |
| Experiment Setup | Yes | "Hyperparameter details are in Appendix C.9." (referring to Table 7 and Table 8, which list specific hyperparameters such as total timesteps, learning rate, batch size, and hidden dimensions). |
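
The Pseudocode row cites Algorithm 1 (Greedy Bipartite Matching for Task Assignment) by name only; the algorithm itself is not reproduced here. As a rough illustration of what greedy bipartite matching generally looks like, and not a reconstruction of the paper's procedure, the sketch below assumes a hypothetical cost matrix `cost[i][j]` (for example, the distance from agent `i` to target `j`) and repeatedly commits the cheapest remaining pair. The function name, the cost definition, and the tie-breaking order are all assumptions made for this example.

```python
from typing import Dict, Sequence


def greedy_bipartite_matching(cost: Sequence[Sequence[float]]) -> Dict[int, int]:
    """Greedily match rows (e.g. agents) to columns (e.g. targets).

    Hypothetical helper for illustration; `cost[i][j]` is an assumed
    pairwise cost, not a quantity defined in the source table.
    """
    # Enumerate every (cost, row, column) triple, cheapest first.
    # Ties are broken by row index, then column index.
    pairs = sorted(
        (cost[i][j], i, j)
        for i in range(len(cost))
        for j in range(len(cost[i]))
    )

    assignment: Dict[int, int] = {}  # row -> column
    used_columns = set()

    for _, i, j in pairs:
        # Skip pairs whose row or column has already been committed.
        if i in assignment or j in used_columns:
            continue
        assignment[i] = j
        used_columns.add(j)

    return assignment


if __name__ == "__main__":
    example = [[1.0, 2.0],
               [1.5, 5.0]]
    print(greedy_bipartite_matching(example))
```

Greedy matching is cheap but not guaranteed to minimize total cost: on the 2x2 example it commits the pair (0, 0) first and is then forced into (1, 1), for a total of 6.0, whereas swapping the assignments would cost 3.5. An optimal solver such as SciPy's `scipy.optimize.linear_sum_assignment` avoids this at a higher computational cost.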