Query-Efficient Imitation Learning for End-to-End Simulated Driving
Authors: Jiakai Zhang, Kyunghyun Cho
AAAI 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate the proposed Safe DAgger in a car racing simulator and show that it indeed requires less queries to a reference policy. We observe a significant speed up in convergence, which we conjecture to be due to the effect of automated curriculum learning. |
| Researcher Affiliation | Academia | Jiakai Zhang, Department of Computer Science, New York University (zhjk@nyu.edu); Kyunghyun Cho, Department of Computer Science and Center for Data Science, New York University (kyunghyun.cho@nyu.edu) |
| Pseudocode | Yes | Algorithm 1 Safe DAgger Blue fonts are used to highlight the differences from the vanilla DAgger. |
| Open Source Code | No | We will release a patch to TORCS that allows seamless integration between TORCS and Torch. |
| Open Datasets | Yes | We use TORCS (TORCS, accessed May 12, 2016), a racing car simulator, for empirical evaluation in this paper. We chose TORCS based on the following reasons. First, it has been used widely and successfully as a platform for research on autonomous racing (Loiacono et al. 2008)... Third, as TORCS is an open-source software, it is easy to interface it with another software which is Torch in our case. |
| Dataset Splits | Yes | We use ten different tracks in total for our experiments. We split those ten tracks into two disjoint sets: seven training tracks and three test tracks. All training examples as well as validation examples are collected from the training tracks only, and a trained primary policy is tested on the test tracks. |
| Hardware Specification | No | The paper mentions running on an 'off-the-shelf workstation' but does not provide specific details such as GPU or CPU models, or memory specifications. |
| Software Dependencies | No | The paper mentions using 'Torch' but does not specify a version number for this software dependency. It only mentions TORCS without a version. |
| Experiment Setup | Yes | In the case of Safe DAgger, we collect 30k, 30k and 10k of training examples (after the subset selection in line 6 of Alg. 1). In the case of the original DAgger, we collect up to 390k data each iteration and uniform-randomly select 30k, 30k and 10k samples as our training examples... We use a deep convolutional network that has five convolutional layers followed by a set of fully-connected layers... We choose τ = 0.0025 as our safety classifier threshold so that approximately 20% of initial training examples are considered safe. |
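The key difference from vanilla DAgger highlighted in the Pseudocode row is that Safe DAgger queries the reference (expert) policy only on states the safety classifier flags as unsafe. A minimal sketch of that per-step control rule, assuming hypothetical `primary`, `safety`, and `reference` callables (the safety classifier here returns a predicted deviation compared against the threshold τ; this is not the authors' released code):

```python
def safe_dagger_step(obs, primary, safety, reference, buffer, tau=0.0025):
    """One control step of the Safe DAgger loop (sketch of Alg. 1).

    `primary`, `safety`, and `reference` are hypothetical callables:
    the learned policy, a classifier predicting the primary policy's
    deviation from the reference, and the reference policy itself.
    The reference is queried only when the predicted deviation
    exceeds tau -- vanilla DAgger would query on every state.
    """
    if safety(obs) < tau:
        # Predicted safe: the primary policy keeps control,
        # and no query is made to the reference policy.
        return primary(obs)
    # Predicted unsafe: query the reference, hand over control,
    # and aggregate the labeled example for the next training round.
    action = reference(obs)
    buffer.append((obs, action))
    return action
```

Because queries (and new training examples) are generated only on unsafe states, the aggregated dataset concentrates on exactly the regions where the primary policy still fails, which is what the paper conjectures acts as an automated curriculum.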
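The stated threshold choice, picking τ = 0.0025 so that roughly 20% of the initial training examples are considered safe, amounts to taking a low quantile of the per-example deviations between the primary and reference policies. A minimal sketch with synthetic deviations (the exponential distribution and sample size are illustrative assumptions, not values from the paper):

```python
import numpy as np

# Hypothetical per-example deviations |pi(s) - pi*(s)| between the
# primary and reference policies on the initial training set.
rng = np.random.default_rng(0)
deviations = rng.exponential(scale=0.01, size=10_000)

# Choose tau as the 20th percentile of the deviations, so that
# ~20% of examples fall below it and are labelled "safe".
tau = np.quantile(deviations, 0.20)

safe_fraction = float(np.mean(deviations < tau))
print(f"tau = {tau:.4f}, safe fraction = {safe_fraction:.2%}")
```

On the authors' actual data this quantile evidently came out near 0.0025; the sketch only shows the selection procedure, not the reported value.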