Evolving Populations of Diverse RL Agents with MAP-Elites
Authors: Thomas Pierrot, Arthur Flajolet
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the benefits brought about by our framework through extensive numerical experiments on a number of robotics control problems, some of which with deceptive rewards, taken from the QD-RL literature. We open source an efficient JAX-based implementation of our algorithm in the QDax library. Figure 2: Performance comparison of PBT-MAP-ELITES with baselines on the basis of standard metrics from the QD literature for five environments from the QDAX suite (which is based on the BRAX engine). |
| Researcher Affiliation | Industry | Thomas Pierrot, InstaDeep, t.pierrot@instadeep.com; Arthur Flajolet, InstaDeep, a.flajolet@instadeep.com |
| Pseudocode | Yes | Appendix A: Pseudocodes for All Algorithms |
| Open Source Code | Yes | We open source an efficient JAX-based implementation of our algorithm in the QDax library (footnote 1: https://github.com/adaptive-intelligent-robotics/QDax). |
| Open Datasets | Yes | All of these environments are based on the BRAX simulator (Freeman et al., 2021) and are available in the QDAX suite (Lim et al., 2022). |
| Dataset Splits | No | The paper does not explicitly describe specific training, validation, and test dataset splits with percentages or sample counts. The problem setup involves continuous environment interaction rather than static dataset splits. |
| Hardware Specification | No | The paper mentions "modern libraries, such as JAX (Bradbury et al., 2018), that seamlessly enable not only to distribute the computations, including computations taking place in the physics engine with BRAX (Freeman et al., 2021), over multiple accelerators" but does not specify exact hardware models (e.g., GPU/CPU types, specific accelerators) used for its experiments. |
| Software Dependencies | No | The paper mentions "JAX (Bradbury et al., 2018)" and "BRAX (Freeman et al., 2021)" as well as the "QDax library", but it does not specify concrete version numbers for any of these software dependencies. |
| Experiment Setup | Yes | In this section, we detail the parameters used for all algorithms. [...] Table 1: PBT parameters. Table 2: PBT-MAP-ELITES parameters. Table 3: MAP-ELITES parameters. Table 4: ME-ES parameters. Table 5: PGA-MAP-ELITES parameters. Table 6: QD-PG parameters. Table 7: SAC hyperparameters ranges (or values if the hyperparameter does not change during training) that PBT and PBT-MAP-ELITES sample from. Table 8: TD3 hyperparameters ranges (or values if the hyperparameter does not change during training) that PBT and PBT-MAP-ELITES sample from. |
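
Since the table notes that no version numbers are given for JAX, BRAX, or QDax, anyone rerunning the experiments may want to record the versions actually installed. The following is a minimal sketch of one way to do that, not something taken from the paper: the distribution names in `PACKAGES` are assumed PyPI names, and `installed_versions` is a hypothetical helper introduced here for illustration.

```python
from importlib import metadata

# Assumed PyPI distribution names (not stated in the paper); adjust as needed.
PACKAGES = ("jax", "jaxlib", "brax", "qdax")


def installed_versions(packages=PACKAGES):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed in the current environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Print pinned requirement lines for installed packages, comments otherwise.
    for name, version in installed_versions().items():
        if version is not None:
            print(f"{name}=={version}")
        else:
            print(f"# {name}: not installed")
```

Running the script in the environment used for an experiment prints lines that can be pasted into a requirements file, with missing packages shown as comments.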