Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Scalable End-to-End Autonomous Vehicle Testing via Rare-event Simulation

Authors: Matthew O'Kelly, Aman Sinha, Hongseok Namkoong, Russ Tedrake, John C. Duchi

NeurIPS 2018

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | We demonstrate our framework on a highway scenario, accelerating system evaluation by 2-20 times over naive Monte Carlo sampling methods and 10-300P times (where P is the number of processors) over real-world testing. |
| Researcher Affiliation | Academia | Matthew O'Kelly, University of Pennsylvania, EMAIL; Aman Sinha, Stanford University, EMAIL; Hongseok Namkoong, Stanford University, EMAIL; John Duchi, Stanford University, EMAIL; Russ Tedrake, Massachusetts Institute of Technology, EMAIL |
| Pseudocode | Yes | Algorithm 1 Cross-Entropy Method |
| Open Source Code | No | The paper refers to 'our open-source toolchain' and 'open-source framework', implying the code's nature, but does not provide a direct link or an explicit statement about releasing the code for the described methodology. |
| Open Datasets | Yes | Using the highway traffic dataset NGSim [36], we train policies of human drivers via imitation learning... We use public traffic data collected by the US Department of Transportation [36]. |
| Dataset Splits | No | The paper does not explicitly provide specific training/validation/test dataset splits for reproducibility in the standard sense (e.g., percentages or sample counts for different subsets of a fixed dataset). |
| Hardware Specification | No | The paper mentions 'distributed among available CPUs and GPUs' and 'heterogeneous GPU compute clusters', but does not provide specific hardware details such as GPU or CPU models. |
| Software Dependencies | Yes | Using the asynchronous messaging library ZeroMQ [21], our implementation is fully-distributed among available CPUs and GPUs; our rollouts are up to 30P times faster than real time, where P is the number of processors... which uses Unreal Engine 4 [17]. |
| Experiment Setup | Yes | We fix the number of iterations at K = 100, number of samples taken per iteration at N_k = 5000, step size for updates at α_k = 0.8, and γ = 0.14. |
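
To make the roles of the Experiment Setup hyperparameters concrete, the following is a minimal sketch of a generic cross-entropy method for rare-event probability estimation. It is not the paper's Algorithm 1 or its simulator. It assumes a toy problem (a standard normal input, a hypothetical risk measure `f(x) = -x`, and a rare event `f(x) <= gamma`), a unit-variance Gaussian proposal whose mean is the only learned parameter, and an elite fraction `rho` that is not among the quoted settings. The names `cross_entropy_rare_event` and `normal_logpdf` are illustrative only.

```python
import math
import random


def normal_logpdf(x, mean, std=1.0):
    """Log density of a univariate normal distribution."""
    z = (x - mean) / std
    return -0.5 * z * z - math.log(std * math.sqrt(2.0 * math.pi))


def cross_entropy_rare_event(f, gamma, num_iters=10, num_samples=5000,
                             step=0.8, rho=0.1, seed=0):
    """Estimate P(f(X) <= gamma) for X ~ N(0, 1) (toy assumption).

    num_iters, num_samples, step, and gamma play the roles of K, N_k,
    alpha_k, and the threshold in the quoted setup; rho is an assumed
    elite fraction used to pick intermediate levels.
    """
    rng = random.Random(seed)
    mean = 0.0  # proposal N(mean, 1) starts at the base distribution
    for _ in range(num_iters):
        xs = [rng.gauss(mean, 1.0) for _ in range(num_samples)]
        scores = [f(x) for x in xs]
        # Intermediate level: the rho-quantile of the scores, but never
        # below the final threshold gamma.
        level = max(gamma, sorted(scores)[int(rho * num_samples)])
        weighted_sum = 0.0
        weight_total = 0.0
        for x, s in zip(xs, scores):
            if s <= level:
                # Likelihood ratio of the base density to the proposal.
                w = math.exp(normal_logpdf(x, 0.0) - normal_logpdf(x, mean))
                weighted_sum += w * x
                weight_total += w
        elite_mean = weighted_sum / weight_total
        # Damped update of the proposal mean with step size `step`.
        mean = step * elite_mean + (1.0 - step) * mean
    # Final importance-sampling estimate under the learned proposal.
    xs = [rng.gauss(mean, 1.0) for _ in range(num_samples)]
    total = 0.0
    for x in xs:
        if f(x) <= gamma:
            total += math.exp(normal_logpdf(x, 0.0) - normal_logpdf(x, mean))
    return total / num_samples, mean


if __name__ == "__main__":
    estimate, proposal_mean = cross_entropy_rare_event(lambda x: -x, gamma=-3.0)
    exact = 0.5 * math.erfc(3.0 / math.sqrt(2.0))  # P(X >= 3) for N(0, 1)
    print(estimate, exact, proposal_mean)
```

In this toy setting the event has probability of roughly 1.35e-3, so naive Monte Carlo with the same 5000 samples would see only a handful of hits, while the shifted proposal places most samples in the event region. Substituting the quoted values (K = 100, N_k = 5000, α_k = 0.8) only changes the keyword arguments. The quoted γ = 0.14 applies to the paper's own risk measure and has no meaning for this toy `f`.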