Active Exploration for Inverse Reinforcement Learning

Authors: David Lindner, Andreas Krause, Giorgia Ramponi

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We empirically evaluate AceIRL in simulations and find that it significantly outperforms more naive exploration strategies."
Researcher Affiliation | Academia | David Lindner, Department of Computer Science, ETH Zurich (david.lindner@inf.ethz.ch); Andreas Krause, Department of Computer Science, ETH Zurich (krausea@ethz.ch); Giorgia Ramponi, ETH AI Center (giorgia.ramponi@ai.ethz.ch)
Pseudocode | Yes | "Algorithm 1: AceIRL algorithm for IRL in an unknown environment."
Open Source Code | Yes | "We provide code to reproduce our experiments at https://github.com/lasgroup/aceirl."
Open Datasets | No | The paper uses simulated environments (Four Paths, Random MDPs, Double Chain, Chain, Gridworld), some of which are based on prior work (Kaufmann et al., 2021; Metelli et al., 2021). However, it provides no concrete access information (e.g., links, DOIs, or citations to publicly available static datasets) for these simulated environments as datasets.
Dataset Splits | No | The paper does not describe training, validation, or test dataset splits. Experiments are conducted in simulated environments and measured in terms of sample complexity and episodes rather than fixed dataset splits.
Hardware Specification | No | The paper gives no details about the hardware (e.g., GPU models, CPU types, memory) used to run the experiments.
Software Dependencies | No | The paper cites external tools such as CVXPY and conic optimization libraries in its references, but it does not give version numbers for any ancillary software dependencies used in the implementation (e.g., Python, PyTorch, TensorFlow, or specific library versions).
Experiment Setup | No | The paper describes the simulated environments and high-level algorithmic components, but the main text does not provide specific experimental setup details such as hyperparameter values (e.g., learning rates, batch sizes, number of epochs) or training configurations.