Active Exploration for Inverse Reinforcement Learning
Authors: David Lindner, Andreas Krause, Giorgia Ramponi
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically evaluate AceIRL in simulations and find that it significantly outperforms more naive exploration strategies. |
| Researcher Affiliation | Academia | David Lindner Department of Computer Science ETH Zurich david.lindner@inf.ethz.ch Andreas Krause Department of Computer Science ETH Zurich krausea@ethz.ch Giorgia Ramponi ETH AI Center giorgia.ramponi@ai.ethz.ch |
| Pseudocode | Yes | Algorithm 1: AceIRL algorithm for IRL in an unknown environment. |
| Open Source Code | Yes | We provide code to reproduce our experiments at https://github.com/lasgroup/aceirl. |
| Open Datasets | No | The paper describes using simulated environments (Four Paths, Random MDPs, Double Chain, Chain, Gridworld), some of which are based on prior work (Kaufmann et al., 2021; Metelli et al., 2021). However, it does not provide concrete access information (e.g., links, DOIs, or specific citations to publicly available static datasets) for these simulated environments as 'datasets'. |
| Dataset Splits | No | The paper does not explicitly provide details about training, validation, or test dataset splits. The experiments are conducted in simulated environments based on sample complexity and episodes, rather than fixed dataset splits. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., GPU models, CPU types, memory) used to run the experiments. |
| Software Dependencies | No | The paper cites external tools such as CVXPY and conic optimization libraries in its references, but it does not provide version numbers for any software dependencies used in the implementation (e.g., Python, PyTorch, TensorFlow, or specific library versions). |
| Experiment Setup | No | The paper describes the simulated environments and high-level algorithmic components, but the main text does not provide specific experimental setup details such as hyperparameter values (e.g., learning rates, batch sizes, number of epochs) or training configurations. |