Rewriting History with Inverse RL: Hindsight Inference for Policy Improvement

Authors: Ben Eysenbach, Xinyang Geng, Sergey Levine, Russ R. Salakhutdinov

NeurIPS 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Our experiments on complex simulated locomotion and manipulation tasks demonstrate that our approach outperforms state-of-the-art multi-task RL methods." ... Section 6, "Experiments: Relabeling with Inverse RL Accelerates Learning"
Researcher Affiliation | Collaboration | Carnegie Mellon University, UC Berkeley, Google Brain
Pseudocode | Yes | Algorithm 1: Approximate Inverse RL; Algorithm 2: HIPI-RL, Inverse RL for Off-Policy RL; Algorithm 3: HIPI-BC, Inverse RL for Behavior Cloning
Open Source Code | Yes | "Full experimental details are included in Appendix E and code has been released." https://github.com/google-research/google-research/tree/master/hipi
Open Datasets | Yes | "For the manipulation environment, Lynch et al. [32] provided a dataset of 100 demonstrations for each of these tasks, which we aggregate into a dataset of 900 demonstrations."
Dataset Splits | No | The paper describes collecting experience and demonstrations but does not provide specific train/validation/test splits or their percentages/counts.
Hardware Specification | No | The paper does not specify the hardware (e.g., GPU/CPU models, memory, or cloud instances) used to run the experiments.
Software Dependencies | No | The paper does not list software dependencies with version numbers (e.g., PyTorch 1.9 or TensorFlow 2.x).
Experiment Setup | No | The paper states "Full experimental details are included in Appendix E" and "See Appendix E.2 for hyperparameters", indicating that these details are not in the main text.
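For orientation only, the sketch below illustrates the general hindsight-relabeling-with-inverse-RL idea that the algorithm names in the Pseudocode row point to: score each trajectory under every candidate task's reward and relabel it with a task sampled from the resulting soft posterior. The trajectory format, the reward_fns interface, and the batch-based normalization used as a stand-in for the partition-function term are illustrative assumptions, not the authors' released hipi implementation.

```python
import numpy as np
from scipy.special import logsumexp

# Hypothetical sketch of hindsight relabeling with an inverse-RL-style task
# posterior. Interfaces and normalization are assumptions for illustration,
# not the released google-research/hipi code.

def relabel_with_inverse_rl(trajectories, reward_fns, temperature=1.0, seed=0):
    """Sample, for each trajectory, the task it appears most (soft-)optimal for.

    trajectories: list of lists of (state, action) pairs (placeholder format).
    reward_fns: dict mapping task id -> callable (state, action) -> float.
    Returns a list of (trajectory, relabeled_task) pairs.
    """
    rng = np.random.default_rng(seed)
    task_ids = list(reward_fns)

    # Return of every trajectory under every candidate task's reward function.
    returns = np.array([[sum(reward_fns[t](s, a) for s, a in traj)
                         for t in task_ids] for traj in trajectories])

    # Normalize each task's column across the batch: a crude stand-in for the
    # partition-function term, so uniformly easy tasks do not claim everything.
    logits = returns / temperature
    logits = logits - logsumexp(logits, axis=0, keepdims=True)

    relabeled = []
    for i, traj in enumerate(trajectories):
        # Softmax over tasks gives an approximate posterior q(task | trajectory).
        probs = np.exp(logits[i] - logsumexp(logits[i]))
        task = task_ids[rng.choice(len(task_ids), p=probs)]
        relabeled.append((traj, task))
    return relabeled


# Toy usage: two goal-reaching "tasks" on a 1-D state, three short trajectories.
if __name__ == "__main__":
    reward_fns = {g: (lambda g: lambda s, a: -abs(s - g))(g) for g in (0.0, 1.0)}
    trajectories = [[(0.1, 0), (0.0, 0)], [(0.9, 0), (1.0, 0)], [(0.5, 0)]]
    for traj, task in relabel_with_inverse_rl(trajectories, reward_fns):
        print(task, traj)
```

The relabeled pairs could then feed either an off-policy RL update or behavior cloning, which is the role the HIPI-RL and HIPI-BC variants listed above play in the paper.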