Neural Bridge Sampling for Evaluating Safety-Critical Autonomous Systems
Authors: Aman Sinha, Matthew O'Kelly, Russ Tedrake, John C. Duchi
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, we demonstrate the efficacy of our approach on a variety of scenarios, illustrating its usefulness as a tool for rapid sensitivity analysis and model comparison that are essential to developing and testing safety-critical autonomous systems. A major focus of this work is empirical, and accordingly, Section 4 empirically demonstrates the superiority of neural bridge sampling over competing techniques in a variety of applications. |
| Researcher Affiliation | Academia | Aman Sinha, Stanford University, amans@stanford.edu; Matthew O'Kelly, University of Pennsylvania, mokelly@seas.upenn.edu; Russ Tedrake, Massachusetts Institute of Technology, russt@mit.edu; John Duchi, Stanford University, jduchi@stanford.edu |
| Pseudocode | Yes | Algorithm 1 Neural bridge sampling |
| Open Source Code | No | No explicit statement or link found regarding the release of open-source code for the described methodology. |
| Open Datasets | Yes | We evaluate a formally-verified neural network controller [48] on the OpenAI Gym continuous Mountain Car environment [67, 17] under a domain perturbation. ... comparing two algorithms on the OpenAI Gym Car Racing environment (which requires a surrogate model for gradients) [55]. |
| Dataset Splits | No | The paper discusses simulation environments and probability distributions for defining scenarios, but does not provide specific training/validation/test dataset splits with percentages or sample counts. |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory amounts) are provided for running the experiments. |
| Software Dependencies | No | The paper mentions software components like 'OpenAI Gym' and 'masked autoregressive flows (MAFs)' but does not provide specific version numbers for any software dependencies. |
| Experiment Setup | No | All methods are given the same computational budget as measured by evaluations of the simulator. This varies from 50,000-100,000 queries to run Algorithm 1 as determined by pγ (see Appendix C for details of each experiment's hyperparameters). |
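
The Experiment Setup row says each method gets the same budget, counted in simulator evaluations. One way to enforce that accounting when reproducing such a comparison is sketched below. This is an illustration and not the authors' code: `BudgetedSimulator`, `BudgetExceeded`, and `toy_simulator` are hypothetical names, and the toy simulator is only a stand-in for a real environment rollout.

```python
from typing import Callable, Sequence


class BudgetExceeded(RuntimeError):
    """Raised when a method requests more simulator queries than allowed."""


class BudgetedSimulator:
    """Wraps a simulator callable and counts every evaluation (hypothetical helper)."""

    def __init__(self, simulate: Callable[[Sequence[float]], float], budget: int):
        self._simulate = simulate
        self.budget = budget
        self.queries = 0

    def __call__(self, x: Sequence[float]) -> float:
        # Refuse the call before running it, so the count never exceeds the budget.
        if self.queries >= self.budget:
            raise BudgetExceeded(f"budget of {self.budget} queries exhausted")
        self.queries += 1
        return self._simulate(x)


def toy_simulator(x: Sequence[float]) -> float:
    # Assumption: a stand-in objective, not one of the paper's environments.
    return sum(v * v for v in x)


if __name__ == "__main__":
    # 50,000 is the lower end of the query range quoted in the table.
    sim = BudgetedSimulator(toy_simulator, budget=50_000)
    value = sim([0.5, -1.0])
    print(value, sim.queries)
```

Wrapping the simulator, instead of counting inside each estimator, means every method under comparison is charged the same way, whatever its internal structure.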