Distributionally Robust Imitation Learning
Authors: Mohammad Ali Bashiri, Brian Ziebart, Xinhua Zhang
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We experimentally show the significant benefits of DROIL's new optimization method on synthetic data and a highway driving environment. Section 6, Experimental Results: In our experiments, we compare DROIL with prior methods on several imitation learning tasks. |
| Researcher Affiliation | Academia | Mohammad Ali Bashiri, Brian D. Ziebart, Xinhua Zhang; Department of Computer Science, University of Illinois at Chicago, Chicago, IL 60607; {mbashi4, zhangx, bziebart}@uic.edu |
| Pseudocode | Yes | Algorithm 1 Distributionally Robust Imitation Learning (DROIL) |
| Open Source Code | Yes | Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Yes] We provide these in the supplementary materials. |
| Open Datasets | No | The paper describes generating data from simulations (e.g., "trajectories are collected from simulated navigation", "We generate trajectories from the optimal policy", "examples sampled from the stochastic MaxEnt IRL policy") but does not provide concrete access information for a publicly available dataset used for training. |
| Dataset Splits | No | The paper discusses training and evaluating, but does not provide specific details on how datasets were split into training, validation, and testing sets (e.g., percentages, sample counts, or references to predefined splits). |
| Hardware Specification | No | The paper does not specify any particular hardware components (e.g., GPU model, CPU model, memory, or cloud instance types) used for running the experiments. |
| Software Dependencies | No | The paper does not list specific software dependencies with version numbers (e.g., Python 3.8, PyTorch 1.9) within its main text. |
| Experiment Setup | Yes | The loss function for this experiment is set to $\mathbb{E}\left[\sum_{t=0}^{T} (\hat{X}_t - \check{X}_t)^2 + (\hat{Y}_t - \check{Y}_t)^2\right]$ and We set a uniformly random loss for DROIL and elapsed time in seconds until convergence with $10^{-3}$ tolerance. |
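
The loss quoted in the Experiment Setup row is a sum of squared 2D position differences over time steps, taken in expectation. The following is a minimal sketch of that quantity, not the authors' implementation. It assumes each trajectory is a list of (x, y) positions for t = 0..T and approximates the expectation by an average over sampled trajectory pairs. The function names `trajectory_loss` and `expected_trajectory_loss` are hypothetical. Because the loss is symmetric, the sketch does not need to assume which of the two trajectories corresponds to the hat symbol and which to the check symbol.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]  # assumed representation: one (x, y) position per time step


def trajectory_loss(traj_a: Sequence[Point], traj_b: Sequence[Point]) -> float:
    """Sum over time steps of the squared x and y differences between two trajectories."""
    if len(traj_a) != len(traj_b):
        raise ValueError("trajectories must have the same number of time steps")
    return sum(
        (xa - xb) ** 2 + (ya - yb) ** 2
        for (xa, ya), (xb, yb) in zip(traj_a, traj_b)
    )


def expected_trajectory_loss(
    pairs: Sequence[Tuple[Sequence[Point], Sequence[Point]]]
) -> float:
    """Monte Carlo estimate of the expectation: mean loss over sampled trajectory pairs."""
    if not pairs:
        raise ValueError("need at least one trajectory pair")
    losses = [trajectory_loss(a, b) for a, b in pairs]
    return sum(losses) / len(losses)


# Usage with made-up positions (illustrative values only, not data from the paper).
a = [(0.0, 0.0), (1.0, 1.0)]
b = [(0.0, 0.0), (2.0, 3.0)]
single = trajectory_loss(a, b)
average = expected_trajectory_loss([(a, b), (a, a)])
```

Averaging over sampled pairs is only one way to estimate the expectation; the paper's own treatment of the expectation may differ.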