Megaverse: Simulating Embodied Agents at One Million Experiences per Second

Authors: Aleksei Petrenko, Erik Wijmans, Brennan Shacklett, Vladlen Koltun

ICML 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We evaluate model-free RL on this benchmark to provide baselines and facilitate future research. The source code is available at www.megaverse.info." ... "We start by benchmarking the performance of the Megaverse platform. We examine pure simulation speed, when no inference or learning are done, as well as performance of Megaverse environments as a part of a full RL training system." ... "Ablation study. We examine the impact of two key performance optimizations in Megaverse: batched rendering (Section 3.2) and geometry optimization (Section 3.3). The results show that both of these techniques are required to achieve high throughput (Table 2)."
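For context, "pure simulation speed" means stepping the vectorized simulator with random actions and no learner in the loop. A minimal sketch of such a throughput measurement, written against a hypothetical Gym-style vectorized interface (Megaverse's actual API may differ):

```python
import time

def measure_throughput(env, num_steps=10_000):
    """Estimate pure simulation speed in environment steps per second.

    `env` is a hypothetical vectorized environment exposing Gym-style
    reset()/step() plus a `num_agents` attribute; the real Megaverse
    API may differ.
    """
    env.reset()
    start = time.perf_counter()
    for _ in range(num_steps):
        # Random actions: no inference or learning in the loop.
        actions = [env.action_space.sample() for _ in range(env.num_agents)]
        env.step(actions)
    elapsed = time.perf_counter() - start
    # Each step advances all agents at once, so the total experience
    # collected is num_steps * num_agents.
    return num_steps * env.num_agents / elapsed
```

Counting every agent's transition is what lets a batched simulator report millions of experiences per second even though the outer loop runs far fewer iterations.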
Researcher Affiliation | Collaboration | ¹Intel Labs, ²University of Southern California, ³Georgia Institute of Technology, ⁴Stanford University. Correspondence to: Aleksei Petrenko <petrenko@usc.edu>.
Pseudocode | No | No pseudocode or algorithm blocks found.
Open Source Code | Yes | "The source code is available at www.megaverse.info."
Open Datasets | No | The paper states that its environments are "procedurally generated", meaning data is generated on the fly rather than drawn from a fixed, publicly available dataset. While the paper compares against other environments such as the "Arcade Learning Environment (ALE) (Bellemare et al., 2013)", it provides no concrete access information (link, DOI, etc.) for any dataset. The paper's own Megaverse-8 benchmark environments are accessible via the source code, but this refers to the generative capability rather than a static dataset.
Dataset Splits | No | The paper does not provide specific details on train/validation/test splits, instead relying on procedurally generated environments for its experiments.
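With procedurally generated environments, the usual substitute for fixed dataset splits is to partition the generator's random seeds so that held-out layouts are never seen during training. The paper does not describe such a protocol; the sketch below is purely illustrative, and `MegaverseEnv` is a placeholder name, not the actual Megaverse API:

```python
import random

# Hypothetical seed partition: layouts generated from TEST_SEEDS are
# held out from training and used only to evaluate generalization.
TRAIN_SEEDS = range(0, 9_000)
TEST_SEEDS = range(9_000, 10_000)

def sample_level_seed(split, rng=random):
    """Draw a level-generator seed from the requested split."""
    seeds = TRAIN_SEEDS if split == "train" else TEST_SEEDS
    return rng.choice(list(seeds))

# Usage (placeholder constructor):
# env = MegaverseEnv("TowerBuilding", seed=sample_level_seed("test"))
```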
Hardware Specification | Yes | System #1 (12x CPU, 1x RTX3090) ... System #2 (36x CPU, 4x RTX2080Ti) ... System #3 (48x CPU, 8x RTX2080Ti) ... Performance measured on a 10-core 1x GTX1080Ti system.
Software Dependencies | No | The paper mentions "PyTorch GPU-side tensors (Paszke et al., 2019)" and uses "Sample Factory (Petrenko et al., 2020)", but it does not specify exact version numbers for these software dependencies, which are required for reproducibility.
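Recording exact versions would be straightforward, e.g. via a pinned requirements file. The versions below are placeholders to illustrate the format, not the authors' actual environment:

```
# requirements.txt -- placeholder version pins, not taken from the paper
torch==1.7.1
sample-factory==1.121.2
numpy==1.19.5
```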
Experiment Setup | No | The paper describes the general training framework, e.g., "asynchronous proximal policy optimization (PPO) (Schulman et al., 2017) with V-trace off-policy correction (Espeholt et al., 2018) using the Sample Factory implementation (Petrenko et al., 2020)", and mentions training for 2×10⁹ environment steps and "Team Spirit reward shaping". However, it lacks specific numerical hyperparameters such as learning rate, batch size, or optimizer settings, which are crucial for full reproducibility.
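A full reproduction would need those values made explicit, for example as a configuration in the style of Sample Factory's APPO settings. Every number below is an invented placeholder except the step budget, which the paper states:

```python
# Hypothetical training configuration; only train_for_env_steps comes
# from the paper, all other values are placeholders.
config = {
    "algo": "APPO",                        # asynchronous PPO
    "with_vtrace": True,                   # V-trace off-policy correction
    "train_for_env_steps": 2_000_000_000,  # 2x10^9 steps, per the paper
    "learning_rate": 1e-4,                 # placeholder
    "batch_size": 2048,                    # placeholder
    "gamma": 0.99,                         # placeholder
    "num_workers": 12,                     # placeholder, hardware-dependent
}
```

Reporting a table of this form (or shipping the exact config file used) is what would close the gap this row identifies.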