Query-Efficient Imitation Learning for End-to-End Simulated Driving

Authors: Jiakai Zhang, Kyunghyun Cho

AAAI 2017 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Each entry below gives a reproducibility variable, the assessed result, and the LLM response; verbatim excerpts from the paper appear in quotes.
Research Type: Experimental
"We evaluate the proposed Safe DAgger in a car racing simulator and show that it indeed requires less queries to a reference policy. We observe a significant speed up in convergence, which we conjecture to be due to the effect of automated curriculum learning."

Researcher Affiliation: Academia
Jiakai Zhang, Department of Computer Science, New York University, zhjk@nyu.edu; Kyunghyun Cho, Department of Computer Science and Center for Data Science, New York University, kyunghyun.cho@nyu.edu

Pseudocode: Yes
"Algorithm 1 Safe DAgger. Blue fonts are used to highlight the differences from the vanilla DAgger."

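To make the quoted algorithm concrete, here is a minimal sketch of the Safe DAgger rollout it describes: a learned safety classifier decides, per state, whether the primary policy may act on its own, and the reference policy is queried (and its labels collected) only on predicted-unsafe states. This is not the authors' Torch implementation; `env`, `primary`, `reference`, and `is_safe` are hypothetical stand-ins, and the environment interface is assumed.

```python
# A minimal sketch of the Safe DAgger data-collection rollout (Algorithm 1),
# not the authors' code. All names below are hypothetical stand-ins.

def safedagger_rollout(env, primary, reference, is_safe):
    collected = []                        # (state, reference_action) pairs
    state, done = env.reset(), False
    while not done:
        if is_safe(state):                # classifier trusts the primary policy
            action = primary(state)       # act without querying the reference
        else:
            action = reference(state)     # query only on predicted-unsafe states
            collected.append((state, action))
        state, done = env.step(action)
    return collected                      # aggregated into the next training set
```

This is what makes the method query-efficient: the reference policy is consulted only on the subset of states the classifier flags as unsafe, rather than on every visited state as in vanilla DAgger.
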
Open Source Code: No
"We will release a patch to TORCS that allows seamless integration between TORCS and Torch."

Open Datasets: Yes
"We use TORCS (accessed May 12, 2016), a racing car simulator, for empirical evaluation in this paper. We chose TORCS based on the following reasons. First, it has been used widely and successfully as a platform for research on autonomous racing (Loiacono et al. 2008)... Third, as TORCS is an open-source software, it is easy to interface it with another software which is Torch in our case."

Dataset Splits: Yes
"We use ten different tracks in total for our experiments. We split those ten tracks into two disjoint sets: seven training tracks and three test tracks. All training examples as well as validation examples are collected from the training tracks only, and a trained primary policy is tested on the test tracks."

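For illustration only, a small sketch of the seven/three track split this response describes, using hypothetical track names; the paper fixes a specific split of named TORCS tracks rather than sampling one.

```python
import random

# Hypothetical track names standing in for the ten TORCS tracks.
tracks = [f"track-{i}" for i in range(10)]

random.seed(0)   # assumption: any fixed seed; the paper's split is a fixed choice
random.shuffle(tracks)
train_tracks, test_tracks = tracks[:7], tracks[7:]

# Training and validation examples come only from the training tracks;
# the trained primary policy is evaluated on the held-out test tracks.
assert not set(train_tracks) & set(test_tracks)
```
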
Hardware Specification: No
The paper mentions running on an "off-the-shelf workstation" but does not give specifics such as GPU or CPU models or memory capacity.

Software Dependencies: No
The paper mentions using Torch but gives no version number; TORCS is likewise named without a version.

Experiment Setup: Yes
"In the case of Safe DAgger, we collect 30k, 30k and 10k of training examples (after the subset selection in line 6 of Alg. 1). In the case of the original DAgger, we collect up to 390k data each iteration and uniform-randomly select 30k, 30k and 10k samples as our training examples... We use a deep convolutional network that has five convolutional layers followed by a set of fully-connected layers... We choose τ = 0.0025 as our safety classifier threshold so that approximately 20% of initial training examples are considered safe."

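The final quoted sentence amounts to a quantile rule: choose τ so that the desired fraction of initial examples falls below it. Below is a hedged sketch of that selection, assuming `deviations` holds per-example distances between the primary and reference actions; the paper reports τ = 0.0025 for its data.

```python
import numpy as np

def pick_safety_threshold(deviations, safe_fraction=0.20):
    """Choose tau so that roughly `safe_fraction` of the initial examples
    are labeled safe, i.e., their primary-vs-reference deviation < tau."""
    tau = float(np.quantile(deviations, safe_fraction))
    safe_labels = deviations < tau   # binary targets for the safety classifier
    return tau, safe_labels

# Illustration with synthetic deviations (not the paper's data):
rng = np.random.default_rng(0)
tau, labels = pick_safety_threshold(rng.exponential(scale=0.01, size=100_000))
```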