Subequivariant Reinforcement Learning in 3D Multi-Entity Physical Environments

Authors: Runfa Chen, Ling Wang, Yu Du, Tianrui Xue, Fuchun Sun, Jianwei Zhang, Wenbing Huang

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "5. Experiments", "5.1. Experimental Setup", "5.2. Evaluations in Diverse Environments", "5.3. Extended Evaluations on Transformer", "5.4. Ablation Studies". "Figure 4. Training and Evaluation Curves in Team Reach Environments." "Table 2. Evaluations on Basic Architectures."
Researcher Affiliation | Collaboration | (1) Dept. of Comp. Sci. & Tech., Institute for AI, BNRist Center, Tsinghua University; (2) Dept. of Info. Eng., Xi'an Research Institute of High-Tech; (3) School of Elec. Eng., Naval University of Engineering; (4) College of Arts & Sci., New York University; (5) THU-Bosch JCML Center; (6) TAMS, Dept. of Informatics, University of Hamburg; (7) Gaoling School of Artificial Intelligence, Renmin University of China; (8) Beijing Key Laboratory of Big Data Management and Analysis Methods.
Pseudocode | Yes | "Algorithm 1 Greedy Bipartite Matching for Task Assignment"
Open Source Code | Yes | "Codes are available on our project page: https://alpc91.github.io/SMERL/."
Open Datasets | Yes | "we propose the Multi-entity Benchmark (MEBEN), a new suite of environments tailored for exploring a wide range of multi-entity reinforcement learning. Built upon JAX-based RL environments (Bradbury et al., 2018; Godwin* et al., 2020; Heek et al., 2023; Freeman et al., 2021; Gu et al., 2021), MEBEN is designed to investigate multi-entity interactions... Code and Environments are available on our project page: https://alpc91.github.io/SMERL/."
Dataset Splits | No | The paper conducts evaluations ('Training and Evaluation Curves') and ablations, and mentions 'MAPPO involves the development of both the joint policy π_θ and a value function V_ϕ, crucial for variance reduction and integrating information beyond the agents' local observations', but it does not specify a distinct validation split, with sizes or percentages, for hyperparameter tuning.
Hardware Specification | No | "MEBEN harnesses the advanced capabilities of the Brax physics simulator (Freeman et al., 2021) and Composer (Gu et al., 2021)... This advancement propels MEBEN into facilitating efficient and scalable hardware-accelerated iterations on GPUs or TPUs, making it exceptionally suitable for morphology-based reinforcement learning experiments." This names general device types (GPUs/TPUs) but no specific models or configurations.
Software Dependencies | No | "SHNN is developed based on the MxT-Bench (Furuta et al., 2023) codebase, leveraging JAX (Bradbury et al., 2018) and Brax (Freeman et al., 2021; Gu et al., 2021) for efficient, hardware-accelerated simulations." Software libraries are mentioned, but no version numbers are specified.
Experiment Setup | Yes | "Hyperparameter details are in Appendix C.9." (This refers to Table 7 and Table 8, which list specific hyperparameters such as total timesteps, learning rate, batch size, and hidden dimensions.)
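
The Pseudocode row cites "Algorithm 1 Greedy Bipartite Matching for Task Assignment" without reproducing it. For readers unfamiliar with the general technique, the following is a minimal sketch of a generic greedy bipartite matching over a cost matrix. It is not the paper's Algorithm 1: the function name `greedy_bipartite_matching`, the agent-by-task cost matrix, and the cheapest-edge-first rule are all assumptions made for illustration.

```python
from typing import Dict, Sequence


def greedy_bipartite_matching(cost: Sequence[Sequence[float]]) -> Dict[int, int]:
    """Greedily pair agents (rows) with tasks (columns), cheapest edge first.

    Hypothetical helper for illustration only; not taken from the paper.
    Returns a dict mapping agent index -> task index.
    """
    # List every (cost, agent, task) edge and sort from cheapest to most expensive.
    edges = sorted(
        (c, agent, task)
        for agent, row in enumerate(cost)
        for task, c in enumerate(row)
    )

    assignment: Dict[int, int] = {}
    used_tasks = set()
    for _, agent, task in edges:
        # Skip edges whose agent or task has already been matched.
        if agent in assignment or task in used_tasks:
            continue
        assignment[agent] = task
        used_tasks.add(task)
    return assignment


if __name__ == "__main__":
    # Two agents, two tasks; entries could be agent-to-target distances.
    example_cost = [[1.0, 2.0], [1.5, 5.0]]
    print(greedy_bipartite_matching(example_cost))
```

A greedy matching is cheap to compute but not guaranteed to minimize total cost. In the example, the greedy rule pairs agent 0 with task 0 and agent 1 with task 1 (total cost 6.0), whereas the swapped pairing costs 3.5. An exact solver such as SciPy's `linear_sum_assignment` would be the usual alternative when optimality matters.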