FormulaZero: Distributionally Robust Online Adaptation via Offline Population Synthesis

Authors: Aman Sinha, Matthew O’Kelly, Hongrui Zheng, Rahul Mangharam, John Duchi, Russ Tedrake

ICML 2020

Reproducibility Variable Result LLM Response
Research Type Experimental Section 4 details the practical implications of the theoretical results, emergent properties of the method, and the experimental performance of our approach. Current autonomous vehicle (AV) technology still struggles in competitive multi-agent scenarios, such as merging onto a highway, where both maximizing performance (negotiating the merge without delay or hesitation) and maintaining safety (avoiding a crash) are important. The strategic implications of this tradeoff are magnified in racing. During the 2019 Formula One season, the race-winner achieved the fastest lap in only 33% of events (Fédération Internationale de l'Automobile, 2019). Empirically, the weak correlation between achieving the fastest lap-time and winning suggests that consistent and robust performance is critical to success. In this paper, we investigate this intuition in the setting of autonomous racing (AR). In AR, an AV must lap a racetrack in the presence of other agents deploying unknown policies. The agent wins if it completes the race faster than its opponents; a crash automatically results in a loss.
Researcher Affiliation Academia 1Stanford University, Stanford, CA, USA 2University of Pennsylvania, Philadelphia, PA, USA 3Massachusetts Institute of Technology, Cambridge, MA, USA.
Pseudocode Yes Algorithm 1 AADAPT; Algorithm 2 EXP3 with Nw arm-pulls per iteration
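The row above references Algorithm 2, EXP3 with Nw arm-pulls per iteration. A minimal sketch of the standard EXP3 update with batched pulls, not the paper's exact algorithm: the `reward_fn`, `gamma`, and horizon `T` are illustrative assumptions, and rewards are assumed to lie in [0, 1].

```python
import math
import random

def exp3(n_arms, reward_fn, T, gamma=0.1, n_pulls=1):
    """Standard EXP3 adversarial bandit with n_pulls arm-pulls per
    iteration (an illustrative stand-in for the paper's Nw)."""
    weights = [1.0] * n_arms
    probs = [1.0 / n_arms] * n_arms
    for _ in range(T):
        total = sum(weights)
        # mix the exponential-weights distribution with uniform exploration
        probs = [(1 - gamma) * w / total + gamma / n_arms for w in weights]
        for _ in range(n_pulls):
            arm = random.choices(range(n_arms), weights=probs)[0]
            reward = reward_fn(arm)  # assumed to return a value in [0, 1]
            # importance-weighted exponential update for the pulled arm
            weights[arm] *= math.exp(gamma * reward / (probs[arm] * n_arms))
    return probs
```

With a reward function that favors one arm, the returned distribution concentrates on that arm while keeping at least gamma/n_arms mass on every arm for exploration.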
Open Source Code Yes The hardware specifications, software, and simulator are open-source (see Appendices C and D for details): https://github.com/travelbureau/f0_icml_code
Open Datasets No The paper describes generating data through self-play within a custom simulator built with an OpenAI Gym API. It does not provide access to a pre-existing, publicly available dataset in the form of a file or repository, nor does it cite a standard public dataset for its main training.
Dataset Splits No The paper describes simulated experiments and real-world validation, but it does not specify explicit training, validation, and testing dataset splits (e.g., percentages or counts) as commonly done for fixed datasets.
Hardware Specification No The paper refers to a 'low-cost 1/10th-scale, Ackermann-steered AV' and 'an embedded processor on board the vehicle' but does not provide specific hardware details such as CPU/GPU models, processor types, or memory amounts used for the experiments in the provided text.
Software Dependencies No The paper mentions software components such as the 'OpenAI Gym API', 'masked autoregressive flow (MAF)', 'inverse autoregressive flow (IAF)', and 'Adam' but does not provide specific version numbers for these or other software dependencies.
Experiment Setup Yes We run AADAPT with L = 5 populations, D = 160 configurations per population, and T = 100 iterations. For vertical MCMC steps, we randomly sample 16 configurations per population and perform V = 2 iterations of 5 hit-and-run proposals. Furthermore, we perform E = DL²/√(t/(L−1)) horizontal steps (motivated by the fact that tunneling from the highest-temperature level to the coldest takes O(L²) accepted steps). Finally, for training, we use Adam (Kingma & Ba, 2014) with a learning rate of 10⁻⁴. For a given robustness level ρ/Nw ∈ {0.001, 0.025, 0.2, 0.4, 0.75, 1.0} (with Nw = 8 for all experiments)...
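The vertical MCMC steps above rely on hit-and-run proposals. A minimal sketch of one hit-and-run step restricted to an axis-aligned box; the box bounds, dimension, and random generator are illustrative assumptions, not the paper's actual configuration space or sampler:

```python
import numpy as np

def hit_and_run_step(x, lo, hi, rng):
    """One hit-and-run proposal: pick a uniformly random direction,
    then sample uniformly along the feasible chord of the box [lo, hi]."""
    d = rng.normal(size=x.shape)
    d /= np.linalg.norm(d)  # uniform direction on the unit sphere
    # per-coordinate step sizes t where x + t*d hits the box faces
    with np.errstate(divide="ignore"):
        t_a = (lo - x) / d
        t_b = (hi - x) / d
    t_min = np.max(np.minimum(t_a, t_b))  # largest lower intersection
    t_max = np.min(np.maximum(t_a, t_b))  # smallest upper intersection
    t = rng.uniform(t_min, t_max)
    return x + t * d
```

Because the step is drawn uniformly over the exact chord through the current point, every proposal stays inside the feasible set by construction.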