reproducibilityindex.ai

Identifiability and Generalizability in Constrained Inverse Reinforcement Learning

Authors: Andreas Schlaginhaufen, Maryam Kamgarpour

ICML 2023

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	In Section 6, we experimentally verify our results in a gridworld environment.
Researcher Affiliation	Academia	SYCAMORE Lab, École Polytechnique Fédérale de Lausanne (EPFL), 1015 Lausanne, Switzerland. Correspondence to: Andreas Schlaginhaufen <andreas.schlaginhaufen@epfl.ch>.
Pseudocode	Yes	Algorithm 1 Gradient Descent Ascent for Constrained Entropy-Regularized IRL
Open Source Code	Yes	The code to all our experiments is available at: https://github.com/andrschl/cirl
Open Datasets	Yes	We consider a gridworld environment (Sutton & Barto, 2018)
Dataset Splits	No	The paper mentions varying the number of expert trajectories (N) and trajectory length (T) but does not specify a training/validation/test split for the dataset itself.
Hardware Specification	No	The paper describes the simulated environment but does not provide any specific hardware details such as GPU/CPU models or memory used for running the experiments.
Software Dependencies	Yes	feasibility is checked via the LP solver linprog provided by (Virtanen et al., 2020).
Experiment Setup	Yes	We consider a gridworld environment... with 36 states (the grid cells) and 4 actions (up, down, left, right). The agent has a 90% chance of reaching the desired location when taking an action and a 10% chance of ending up in a random neighboring grid cell. We choose the entropy regularization $f(\mu) = \mathbb{E}_{(s,a)\sim\mu}[H(\pi^{\mu}(\cdot \mid s))]$. We use a primal-dual gradient-descent-ascent method... with $N \in \{10, 100, 1000, 10000\}$ trajectories of length $T = 10000$.
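
The gridworld dynamics quoted under Experiment Setup can be sketched as a transition tensor. This is a minimal illustration, not the authors' code: the 6 x 6 layout, the rule that a move into a wall leaves the agent in place, and the uniform spread of the 10% slip probability over in-grid neighbors are all assumptions, and the names `build_transitions` and `neighbors` are hypothetical.

```python
import numpy as np

SIZE = 6                                     # assumption: 36 states arranged as a 6 x 6 grid
N_STATES = SIZE * SIZE
MOVES = [(-1, 0), (1, 0), (0, -1), (0, 1)]   # up, down, left, right
P_SUCCESS = 0.9                              # 90% chance of reaching the desired cell


def neighbors(row, col):
    """Indices of the in-grid cells one step away from (row, col)."""
    cells = []
    for dr, dc in MOVES:
        r, c = row + dr, col + dc
        if 0 <= r < SIZE and 0 <= c < SIZE:
            cells.append(r * SIZE + c)
    return cells


def build_transitions():
    """Return P with P[s, a, s2] = probability of moving from s to s2 under action a."""
    P = np.zeros((N_STATES, len(MOVES), N_STATES))
    for row in range(SIZE):
        for col in range(SIZE):
            s = row * SIZE + col
            nbrs = neighbors(row, col)
            for a, (dr, dc) in enumerate(MOVES):
                r, c = row + dr, col + dc
                inside = 0 <= r < SIZE and 0 <= c < SIZE
                # Assumption: a move into a wall leaves the agent where it is.
                target = r * SIZE + c if inside else s
                P[s, a, target] += P_SUCCESS
                # Assumption: slip mass is split uniformly over in-grid neighbors.
                for n in nbrs:
                    P[s, a, n] += (1.0 - P_SUCCESS) / len(nbrs)
    return P


P = build_transitions()
```

Each slice `P[s, a, :]` sums to one, so the tensor can be passed to any tabular solver that expects a row-stochastic kernel.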
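
The Software Dependencies row mentions a feasibility check with `linprog` from SciPy. The quoted sentence does not say which constraint set is tested, so the following is only a generic sketch of an LP feasibility test with a zero objective; the helper name `is_feasible` and the toy constraints are hypothetical.

```python
import numpy as np
from scipy.optimize import linprog


def is_feasible(A_eq, b_eq, A_ub=None, b_ub=None):
    """Check whether {x >= 0 : A_eq x = b_eq, A_ub x <= b_ub} is non-empty."""
    n = A_eq.shape[1]
    result = linprog(
        c=np.zeros(n),      # zero objective: only feasibility matters
        A_ub=A_ub,
        b_ub=b_ub,
        A_eq=A_eq,
        b_eq=b_eq,
        bounds=(0, None),   # x >= 0 for every variable
        method="highs",
    )
    return result.status == 0   # status 0: solved, status 2: infeasible


# Toy example: x lies on the 2-dimensional probability simplex.
simplex_A = np.array([[1.0, 1.0]])
simplex_b = np.array([1.0])

# Feasible: x0 <= 0.5 is compatible with x0 + x1 = 1.
ok = is_feasible(simplex_A, simplex_b, np.array([[1.0, 0.0]]), np.array([0.5]))

# Infeasible: x0 + x1 <= 0.5 contradicts x0 + x1 = 1.
bad = is_feasible(simplex_A, simplex_b, np.array([[1.0, 1.0]]), np.array([0.5]))
```

In a constrained MDP setting, the equality rows would typically encode flow constraints on an occupancy measure and the inequality rows a cost budget, but that mapping is an assumption here rather than something the summary states.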