Adversarial Imitation Learning with Preferences

Authors: Aleksandar Taranovic, Andras Gabor Kupcsik, Niklas Freymuth, Gerhard Neumann

ICLR 2023

Reproducibility variables, with the assessed result and the LLM's supporting response:

Research Type: Experimental
  "We experimentally validate the effectiveness of combining both preferences and demonstrations on common benchmarks and also show that our method can efficiently learn challenging robot manipulation tasks."

Researcher Affiliation: Collaboration
  Aleksandar Taranovic (1,2), Andras Kupcsik (2), Niklas Freymuth (1), Gerhard Neumann (1). (1) Autonomous Learning Robots Lab, Karlsruhe Institute of Technology, Karlsruhe, Germany; (2) Bosch Center for Artificial Intelligence, Renningen, Germany.

Pseudocode: Yes
  "we additionally provide pseudocode in Appendix B." and "The AILP algorithm with all individual steps is shown in Alg. 1 below."

Open Source Code: No
  The paper only provides a link (https://github.com/pokaxpoka/B_Pref) to the official implementation of a baseline method (Pebble), stating "We use the official implementation which is contained in the same code repository as for (Lee et al., 2021b)." It does not state that the source code for the proposed method (AILP) is publicly available.

Open Datasets: Yes
  "We consider 6 different manipulation tasks from the metaworld benchmark (Yu et al., 2019)" and "Furthermore, we also evaluate the performance in a Mujoco task, Half Cheetah (Todorov et al., 2012)."

Dataset Splits: No
  The paper does not specify training/validation/test splits. It describes using expert demonstrations and generating samples for training, but does not state how the data is partitioned into distinct sets.

Hardware Specification: No
  The paper does not report the hardware (e.g., GPU/CPU models, memory, or cluster specifications) used to run the experiments.

Software Dependencies: No
  The paper mentions software components such as Soft Actor-Critic (SAC) and the Adam optimizer, but gives no version numbers for these or for any other dependencies (e.g., Python, PyTorch, TensorFlow, or specific libraries).

Experiment Setup: Yes
  "In all experiments we use the same set of 10 random seeds." and "In all evaluated experiments in Section 5 we use the same parameters for SAC and those are listed in Table 2."
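For context on the preference side of the method: preference-based reward learning, as used in the Pebble baseline the paper compares against, typically fits a reward model to pairwise segment comparisons via a Bradley-Terry likelihood. The sketch below is a minimal illustration of that general idea, not the authors' AILP implementation; all function names, the linear reward model, and the toy segments are illustrative assumptions.

```python
import numpy as np

# Minimal sketch of Bradley-Terry preference-based reward learning.
# This is NOT the AILP code; it only illustrates the standard loss used
# in preference-based RL (e.g., the Pebble baseline the paper cites).

def segment_return(reward_model, segment):
    """Sum the learned per-step rewards over a segment of feature vectors."""
    return sum(reward_model(x) for x in segment)

def preference_prob(reward_model, seg_a, seg_b):
    """P(seg_a preferred over seg_b) under the Bradley-Terry model."""
    diff = segment_return(reward_model, seg_a) - segment_return(reward_model, seg_b)
    return 1.0 / (1.0 + np.exp(-diff))

def preference_loss(reward_model, seg_a, seg_b, label):
    """Cross-entropy loss; label = 1.0 if seg_a is preferred, 0.0 otherwise."""
    p = preference_prob(reward_model, seg_a, seg_b)
    eps = 1e-8  # avoid log(0)
    return -(label * np.log(p + eps) + (1.0 - label) * np.log(1.0 - p + eps))

# Toy usage: a linear reward model that already ranks seg_a above seg_b,
# so labeling seg_a as preferred yields a small loss.
w = np.array([1.0, 0.5])
model = lambda x: float(w @ x)
seg_a = [np.array([1.0, 1.0]), np.array([2.0, 0.0])]   # higher learned return
seg_b = [np.array([0.0, 0.0]), np.array([0.5, 0.0])]   # lower learned return
assert preference_loss(model, seg_a, seg_b, 1.0) < preference_loss(model, seg_a, seg_b, 0.0)
```

In a full pipeline this loss would be minimized over a dataset of human-labeled segment pairs, and the resulting reward model would then shape the policy objective alongside the adversarial imitation signal.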