InfoGAIL: Interpretable Imitation Learning from Visual Demonstrations
Authors: Yunzhu Li, Jiaming Song, Stefano Ermon
NeurIPS 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the performance of our method by applying it first to a synthetic 2D example and then in a challenging driving domain where the agent is imitating driving behaviors from visual inputs. By conducting experiments on these two environments, we show that our learned policy can 1) imitate expert behaviors using high-dimensional inputs with only a small number of expert demonstrations, 2) cluster expert behaviors into different and semantically meaningful categories, and 3) reproduce different categories of behaviors by setting the high-level latent variables appropriately. |
| Researcher Affiliation | Academia | Yunzhu Li (MIT, liyunzhu@mit.edu); Jiaming Song (Stanford University, tsong@cs.stanford.edu); Stefano Ermon (Stanford University, ermon@cs.stanford.edu) |
| Pseudocode | Yes | Algorithm 1 InfoGAIL (a toy sketch of this training loop appears after the table) |
| Open Source Code | Yes | The code for reproducing the experiments is available at https://github.com/ermongroup/InfoGAIL. |
| Open Datasets | No | The paper mentions using demonstrations collected by manually driving in the TORCS simulator ("The Open Racing Car Simulator" [15]). While TORCS itself is open-source software, the paper does not provide concrete access information (e.g., a link, DOI, or dataset citation) for the specific expert demonstrations used in its experiments. |
| Dataset Splits | No | The paper mentions "80 expert trajectories in total, with 100 frames in each trajectory" for some experiments, but it does not specify how these trajectories are split into training, validation, and test sets. There is no explicit mention of split percentages, absolute counts, or a specific splitting methodology. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments, such as GPU models, CPU types, or memory specifications. It refers to computation only in general terms (e.g., processing high-dimensional visual inputs). |
| Software Dependencies | No | The paper mentions specific optimization algorithms and network architectures, such as RMSprop, the Adam optimizer [23], TRPO [2], and a Deep Residual Network [31] pre-trained on ImageNet [32], but it does not provide version numbers for any software libraries or frameworks (e.g., TensorFlow, PyTorch, scikit-learn). |
| Experiment Setup | No | The paper mentions hyperparameters λ0, λ1, and λ2 but states only that they are positive (> 0), without giving numerical values. It describes the general nature of some inputs (e.g., a 10-dimensional auxiliary input and a one-hot encoded latent code) and high-level network architectures, but lacks concrete training details such as learning rates, batch sizes, number of epochs, or specific layer configurations (number of units, activation functions, kernel sizes). A sketch of how the λ-weighted terms combine appears after the table. |
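
To make the Pseudocode row concrete, below is a minimal, runnable sketch of the update order described by Algorithm 1 of the paper (sample a latent code, roll out, then update discriminator, posterior, and policy in turn). Everything here is a toy stand-in chosen for illustration: the linear "networks", random rollout features, dimensions, and learning rate are assumptions, not the authors' architecture or code, and the TRPO policy step is only indicated in a comment.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins so the loop runs: linear "networks" over random
# state-action features. Dimensions and learning rate are assumptions.
n_codes, dim, lr = 3, 8, 1e-2
W_d = rng.normal(size=dim)              # discriminator D(s, a)
W_q = rng.normal(size=(n_codes, dim))   # posterior Q(c | s, a)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

for step in range(200):
    # 1) Sample a latent code c ~ p(c) (uniform discrete prior, per the paper).
    c = rng.integers(n_codes)

    # 2) "Roll out" the policy conditioned on c; here the trajectories are
    #    just random feature batches shifted by the code.
    feats = rng.normal(size=(32, dim)) + 0.1 * c
    expert = rng.normal(size=(32, dim))

    # 3) Discriminator ascent on E_pi[log D] + E_expert[log(1 - D)].
    d_fake, d_real = sigmoid(feats @ W_d), sigmoid(expert @ W_d)
    W_d += lr * ((feats * (1.0 - d_fake[:, None])).mean(0)
                 - (expert * d_real[:, None]).mean(0))

    # 4) Posterior ascent on E[log Q(c | s, a)], the variational bound on
    #    the mutual information between c and the trajectory.
    probs = np.exp(feats @ W_q.T)
    probs /= probs.sum(1, keepdims=True)
    grad = -probs
    grad[:, c] += 1.0                   # one-hot(c) - softmax
    W_q += lr * (grad[:, :, None] * feats[:, None, :]).mean(0)

    # 5) Policy step: the paper uses TRPO on a surrogate reward of the form
    #    -log D(s, a) + lambda_1 * log Q(c | s, a); omitted in this toy.
```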
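The Experiment Setup row notes that λ0, λ1, and λ2 are reported only as positive, so a reproducer must guess their values. The sketch below shows one reading of how the generator-side objective combines the terms those weights scale (discriminator score, reward augmentation, mutual-information bound, causal entropy); the λ values and all batch statistics are assumed placeholders, not numbers from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy per-step quantities standing in for real rollouts:
d_fake = rng.uniform(0.1, 0.9, size=64)        # D(s, a) on policy samples
q_c = rng.uniform(0.05, 0.95, size=64)         # Q(c | s, a) for the sampled c
log_pi = np.log(rng.uniform(0.05, 0.95, 64))   # log pi(a | s, c)
eta = rng.normal(size=64)                      # auxiliary reward zeta(s, a)

# Assumed values; the paper states only that these are > 0.
lambda_0, lambda_1, lambda_2 = 1.0, 0.1, 0.001

# Variational lower bound on I(c; tau): E[log Q(c | s, a)] + H(c).
# H(c) is a constant of the fixed prior, so it is dropped here.
L_I = np.log(q_c).mean()

# Causal-entropy estimate H(pi) ~= -E[log pi(a | s, c)].
H_pi = -log_pi.mean()

# Generator-side objective to minimize:
# E[log D] - lambda_0 * E[eta] - lambda_1 * L_I - lambda_2 * H(pi),
# where the lambda_0 term corresponds to the paper's reward augmentation.
loss = (np.log(d_fake).mean() - lambda_0 * eta.mean()
        - lambda_1 * L_I - lambda_2 * H_pi)
print(f"L_I={L_I:.3f}  H_pi={H_pi:.3f}  loss={loss:.3f}")
```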