Rewriting History with Inverse RL: Hindsight Inference for Policy Improvement
Authors: Ben Eysenbach, Xinyang Geng, Sergey Levine, Russ R. Salakhutdinov
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments on complex simulated locomotion and manipulation tasks demonstrate that our approach outperforms state-of-the-art multi-task RL methods. ... Section 6: Experiments: Relabeling with Inverse RL Accelerates Learning |
| Researcher Affiliation | Collaboration | Carnegie Mellon University; UC Berkeley; Google Brain |
| Pseudocode | Yes | Algorithm 1 Approximate Inverse RL. ... Algorithm 2 HIPI-RL: Inverse RL for Off-Policy RL ... Algorithm 3 HIPI-BC: Inverse RL for Behavior Cloning (a hedged sketch of the relabeling step appears below the table) |
| Open Source Code | Yes | Full experimental details are included in Appendix E and code has been released: https://github.com/google-research/google-research/tree/master/hipi |
| Open Datasets | Yes | For the manipulation environment, Lynch et al. [32] provided a dataset of 100 demonstrations for each of these tasks, which we aggregate into a dataset of 900 demonstrations. |
| Dataset Splits | No | The paper describes collecting experience and demonstrations, but does not provide specific train/validation/test dataset splits or their percentages/counts. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory, or cloud instances) used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., library names like PyTorch 1.9 or TensorFlow 2.x). |
| Experiment Setup | No | The paper states 'Full experimental details are included in Appendix E' and 'See Appendix E.2 for hyperparameters', indicating that these details are not in the main text. |
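
To make the Pseudocode entry concrete: the paper's Algorithm 1 relabels past experience with the task under which it looks most (soft-)optimal, sampling tasks from a MaxEnt inverse-RL posterior q(task | τ) ∝ exp(Σₜ r_task(sₜ, aₜ) − log Z(task)). Below is a minimal sketch of that relabeling step, not the released code's API. It assumes trajectories are lists of (state, action) pairs, and the names `reward_fn`, `log_partition`, and `relabel_batch` are hypothetical; in the paper the log-partition term is itself approximated (e.g., with a soft value function), which we treat here as a caller-supplied callable.

```python
# Hedged sketch of hindsight task relabeling in the spirit of Algorithm 1
# (Approximate Inverse RL). All function names are illustrative assumptions,
# not the API of the released google-research/hipi code.
import numpy as np

def relabel_batch(trajectories, tasks, reward_fn, log_partition,
                  temperature=1.0, rng=None):
    """Sample a hindsight task for each trajectory.

    Relabeling distribution (MaxEnt inverse-RL posterior):
        q(task | tau) ∝ exp( sum_t r_task(s_t, a_t) - log Z(task) )
    where log Z(task) is an approximation supplied by the caller.
    """
    rng = rng or np.random.default_rng()
    log_z = np.array([log_partition(task) for task in tasks])
    relabeled = []
    for tau in trajectories:
        # Cumulative reward of this trajectory under each candidate task.
        returns = np.array(
            [sum(reward_fn(s, a, task) for s, a in tau) for task in tasks]
        )
        # Subtracting log Z keeps generically easy tasks from dominating.
        logits = (returns - log_z) / temperature
        # Numerically stable softmax over candidate tasks.
        probs = np.exp(logits - logits.max())
        probs /= probs.sum()
        relabeled.append(tasks[rng.choice(len(tasks), p=probs)])
    return relabeled
```

The relabeled tasks can then feed either off-policy RL (HIPI-RL) or behavior cloning (HIPI-BC), per Algorithms 2 and 3; this sketch covers only the shared relabeling step.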