Model-Predictive Policy Learning with Uncertainty Regularization for Driving in Dense Traffic
Authors: Mikael Henaff, Alfredo Canziani, Yann LeCun
ICLR 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our approach using a large-scale observational dataset of driving behavior recorded from traffic cameras, and show that we are able to learn effective driving policies from purely observational data, with no environment interaction. |
| Researcher Affiliation | Collaboration | Mikael Henaff, Courant Institute, New York University & Microsoft Research NYC (mbh305@nyu.edu); Alfredo Canziani, Courant Institute, New York University (canziani@nyu.edu); Yann LeCun, Courant Institute, New York University & Facebook AI Research (yann@cs.nyu.edu) |
| Pseudocode | No | The paper describes its algorithms and training steps in prose and diagrams (Figures 2, 3, and 10), but contains no formal pseudocode blocks or sections labeled as algorithms. (A hedged sketch of the training recipe appears after this table.) |
| Open Source Code | Yes | Code and additional video results for the model predictions and learned policies can be found at the following URL: https://sites.google.com/view/model-predictive-driving/home. |
| Open Datasets | Yes | The Next Generation Simulation program's Interstate 80 (NGSIM I-80) dataset (Halkias & Colyar, 2006) consists of 45 minutes of recordings from traffic cameras mounted over a stretch of highway. |
| Dataset Splits | Yes | This yields a total of 5596 car trajectories, which we split into training (80%), validation (10%) and testing (10%) sets. (A sketch of this split appears after the table.) |
| Hardware Specification | No | No specific hardware details (e.g., CPU, GPU model numbers, memory) were found in the paper. |
| Software Dependencies | No | The paper mentions 'OpenAI Gym (Brockman et al., 2016)', 'Adam (Kingma & Ba, 2014)', 'Proximal Policy Optimization (PPO) (Schulman et al., 2017)', and 'OpenAI Baselines', but it provides no version numbers for any of these, nor for Python or PyTorch. |
| Experiment Setup | Yes | Our model was trained using Adam (Kingma & Ba, 2014) with learning rate 0.0001 and minibatches of size 64, unrolled for 20 time steps, and with dropout (`p_dropout` = 0.1) at every layer, which was necessary for computing the epistemic uncertainty cost when training the policy network. (A hedged PyTorch sketch of this setup appears below.) |
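The Dataset Splits row is concrete enough to illustrate. Below is a minimal Python sketch, assuming trajectories are keyed by integer IDs and the split is a simple random shuffle; the paper states only the 80/10/10 proportions, not the shuffling procedure or seed.

```python
import random

def split_trajectories(trajectory_ids, seed=0):
    """Split car trajectories into train/validation/test (80/10/10), as reported in the paper."""
    ids = list(trajectory_ids)
    random.Random(seed).shuffle(ids)  # seed is an assumption; the paper does not specify one
    n = len(ids)
    n_train = int(0.8 * n)
    n_valid = int(0.1 * n)
    return {
        "train": ids[:n_train],
        "valid": ids[n_train:n_train + n_valid],
        "test": ids[n_train + n_valid:],
    }

splits = split_trajectories(range(5596))
print({k: len(v) for k, v in splits.items()})
# {'train': 4476, 'valid': 559, 'test': 561}
```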
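The Experiment Setup and Pseudocode rows together describe the policy-training recipe: Adam with learning rate 0.0001, minibatches of 64, 20-step unrolls, and dropout (`p_dropout` = 0.1) at every layer so the epistemic uncertainty cost can be computed. The PyTorch sketch below illustrates that recipe under stated assumptions: the network shapes, the `task_cost` placeholder, the `uncertainty_weight` coefficient, and `n_samples` are all invented for illustration, and the paper's actual networks are convolutional models over top-down traffic observations, not the toy MLPs used here.

```python
import torch
import torch.nn as nn

STATE_DIM, ACTION_DIM = 64, 2  # illustrative sizes only

# Toy stand-ins for the paper's networks.
forward_model = nn.Sequential(
    nn.Linear(STATE_DIM + ACTION_DIM, 256),
    nn.Dropout(p=0.1),  # p_dropout = 0.1, as reported in the paper
    nn.ReLU(),
    nn.Linear(256, STATE_DIM),
)
policy = nn.Sequential(nn.Linear(STATE_DIM, 128), nn.ReLU(), nn.Linear(128, ACTION_DIM))

# The forward model is trained beforehand and held fixed while the policy learns.
for p in forward_model.parameters():
    p.requires_grad_(False)

optimizer = torch.optim.Adam(policy.parameters(), lr=1e-4)  # learning rate from the paper

def epistemic_uncertainty(state, action, n_samples=10):
    """Variance of forward-model predictions across dropout masks (MC dropout).

    The model stays in train mode so dropout remains stochastic, matching the
    paper's note that dropout at every layer was necessary for computing the
    epistemic uncertainty cost. `n_samples` is an assumed value.
    """
    forward_model.train()
    x = torch.cat([state, action], dim=-1)
    preds = torch.stack([forward_model(x) for _ in range(n_samples)])
    return preds.var(dim=0).mean()

def task_cost(state):
    """Placeholder for the paper's proximity and lane costs."""
    return state.pow(2).mean()

# One policy update: a minibatch of 64, unrolled for 20 time steps (both from
# the paper). `uncertainty_weight` is an assumed coefficient.
state = torch.randn(64, STATE_DIM)
uncertainty_weight = 0.5
loss = torch.zeros(())
for _ in range(20):
    action = policy(state)
    loss = loss + task_cost(state) + uncertainty_weight * epistemic_uncertainty(state, action)
    state = forward_model(torch.cat([state, action], dim=-1))
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

Keeping dropout active during policy training, rather than switching to eval mode, is the key detail: the variance across stochastic forward passes serves as the penalty that discourages the policy from steering into states where the learned forward model is unreliable.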