FormulaZero: Distributionally Robust Online Adaptation via Offline Population Synthesis
Authors: Aman Sinha, Matthew O’Kelly, Hongrui Zheng, Rahul Mangharam, John Duchi, Russ Tedrake
ICML 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Section 4 details the practical implications of the theoretical results, emergent properties of the method, and the experimental performance of our approach. Current autonomous vehicle (AV) technology still struggles in competitive multi-agent scenarios, such as merging onto a highway, where both maximizing performance (negotiating the merge without delay or hesitation) and maintaining safety (avoiding a crash) are important. The strategic implications of this tradeoff are magnified in racing. During the 2019 Formula One season, the race-winner achieved the fastest lap in only 33% of events (Fédération Internationale de l'Automobile, 2019). Empirically, the weak correlation between achieving the fastest lap-time and winning suggests that consistent and robust performance is critical to success. In this paper, we investigate this intuition in the setting of autonomous racing (AR). In AR, an AV must lap a racetrack in the presence of other agents deploying unknown policies. The agent wins if it completes the race faster than its opponents; a crash automatically results in a loss. |
| Researcher Affiliation | Academia | 1Stanford University, Stanford, CA, USA 2University of Pennsylvania, Philadelphia, PA, USA 3Massachusetts Institute of Technology, Cambridge, Massachusetts, USA. |
| Pseudocode | Yes | Algorithm 1 AADAPT; Algorithm 2 EXP3 with Nw arm-pulls per iteration |
| Open Source Code | Yes | The hardware specifications, software, and simulator are open-source (see Appendices C and D for details): https://github.com/travelbureau/f0_icml_code |
| Open Datasets | No | The paper describes generating data through self-play within a custom simulator built with an OpenAI Gym API. It does not provide access to a pre-existing, publicly available dataset in the form of a file or repository, nor does it cite a standard public dataset for its main training. |
| Dataset Splits | No | The paper describes simulated experiments and real-world validation, but it does not specify explicit training, validation, and testing dataset splits (e.g., percentages or counts) as commonly done for fixed datasets. |
| Hardware Specification | No | The paper refers to a 'low-cost 1/10th-scale, Ackermann-steered AV' and 'an embedded processor on board the vehicle' but does not provide specific hardware details such as CPU/GPU models, processor types, or memory amounts used for the experiments in the provided text. |
| Software Dependencies | No | The paper mentions software components such as 'OpenAI Gym API', 'masked autoregressive flow (MAF)', 'inverse autoregressive flow (IAF)', and 'Adam' but does not provide specific version numbers for these or other software dependencies. |
| Experiment Setup | Yes | We run AADAPT with L = 5 populations, D = 160 configurations per population, and T = 100 iterations. For vertical MCMC steps, we randomly sample 16 configurations per population and perform V = 2 iterations of 5 hit-and-run proposals. Furthermore, we perform E = DL²√t/(L − 1) horizontal steps (motivated by the fact that tunneling from the highest-temperature level to the coldest takes O(L²) accepted steps). Finally, for training, we use Adam (Kingma & Ba, 2014) with a learning rate of 10⁻⁴. For a given robustness level ρ/Nw ∈ {0.001, 0.025, 0.2, 0.4, 0.75, 1.0} (with Nw = 8 for all experiments)... |
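The pseudocode row names "Algorithm 2: EXP3 with Nw arm-pulls per iteration" but does not reproduce it. A minimal sketch of the standard EXP3 bandit algorithm, modified so each iteration pulls the chosen arm Nw times and averages the rewards (function names and the `eta` parameter are illustrative assumptions, not the paper's exact algorithm; rewards are assumed to lie in [0, 1]):

```python
import math
import random

def exp3_with_multipull(num_arms, reward_fn, T, Nw=8, eta=0.1, rng=random):
    """Sketch of EXP3 where each iteration pulls the selected arm Nw times.

    reward_fn(arm) -> reward in [0, 1]; averaging Nw pulls reduces the
    variance of the importance-weighted reward estimate.
    """
    weights = [1.0] * num_arms
    for _ in range(T):
        total = sum(weights)
        # exploration-smoothed sampling distribution over arms
        probs = [(1 - eta) * w / total + eta / num_arms for w in weights]
        arm = rng.choices(range(num_arms), weights=probs)[0]
        # pull the chosen arm Nw times and average the observed rewards
        avg_reward = sum(reward_fn(arm) for _ in range(Nw)) / Nw
        # importance weighting keeps the reward estimate unbiased
        estimate = avg_reward / probs[arm]
        weights[arm] *= math.exp(eta * estimate / num_arms)
    total = sum(weights)
    return [w / total for w in weights]
```

Because only the pulled arm's weight is updated (via the importance-weighted estimate), the returned distribution concentrates on the arm with the highest mean reward.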
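The setup row mentions vertical MCMC steps built from hit-and-run proposals. A simplified, illustrative version of one such proposal on a box-constrained configuration space (this is a generic sketch, not the paper's implementation; the rejection-on-exit rule and `step_scale` are assumptions):

```python
import math
import random

def hit_and_run_step(x, in_support, step_scale=1.0, rng=random):
    """One simplified hit-and-run proposal.

    Picks a uniformly random direction on the unit sphere, then a random
    signed step along it; proposals leaving the support set are rejected
    (the chain stays at x).
    """
    d = len(x)
    # random direction: normalized standard Gaussian vector
    direction = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(v * v for v in direction))
    direction = [v / norm for v in direction]
    # random signed step length along the chosen direction
    step = rng.uniform(-step_scale, step_scale)
    proposal = [xi + step * di for xi, di in zip(x, direction)]
    return proposal if in_support(proposal) else x
```

Full hit-and-run instead samples uniformly along the entire chord where the line intersects the support, which mixes faster; the rejection variant above is just the shortest way to show the direction-then-step structure.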