Offline Imitation from Observation via Primal Wasserstein State Occupancy Matching
Authors: Kai Yan, Alex Schwing, Yu-Xiong Wang
ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirically, we find that PW-DICE improves upon several state-of-the-art methods. |
| Researcher Affiliation | Academia | 1The Grainger College of Engineering, University of Illinois Urbana-Champaign, Urbana, Illinois, USA. |
| Pseudocode | Yes | Algorithm 1 PW-DICE |
| Open Source Code | Yes | The code is available at https://github.com/KaiYan289/PW-DICE. |
| Open Datasets | Yes | SMODICE uses a single trajectory (1000 states) from the expert-v2 dataset in D4RL (Fu et al., 2020b) as the expert dataset E. |
| Dataset Splits | No | The paper describes dataset usage for training and testing, and mentions batch sizes and training lengths, but does not provide explicit details on train/validation/test dataset splits, such as percentages or specific sample counts for each split. |
| Hardware Specification | Yes | All experiments are carried out with a single NVIDIA RTX 2080Ti GPU on an Ubuntu 18.04 server with 72 Intel Xeon Gold 6254 CPUs @ 3.10GHz. |
| Software Dependencies | No | The paper mentions software such as CVXPY, Gurobi, MOSEK, OpenAI Gym, and D4RL, but it does not specify version numbers for these or other key software components (e.g., Python, PyTorch) required to ensure reproducibility. |
| Experiment Setup | Yes | Tab. 1 summarizes our hyperparameters, which are also the hyperparameters of plain Behavior Cloning if applicable. For baselines (SMODICE, LobsDICE, ORIL, OTR, and DWBC), we use the hyperparameters reported in their paper (unless the hyperparameter values in the paper and the code differ, in which case we report the values from the code). |
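
As a rough illustration of the Open Datasets row above, the sketch below shows one way to pull a single expert trajectory out of a D4RL expert-v2 dataset. This is a minimal sketch and not the authors' released code: the environment name `hopper-expert-v2` and the episode-boundary logic are assumptions for illustration.

```python
# Minimal sketch (assumption: hopper-expert-v2; not the authors' code):
# extract one expert trajectory (up to 1000 states) from a D4RL expert-v2 dataset.
import gym
import d4rl  # importing d4rl registers its offline environments with gym
import numpy as np

env = gym.make('hopper-expert-v2')
data = env.get_dataset()  # dict with 'observations', 'actions', 'terminals', 'timeouts', ...

# The dataset is stored as flat arrays; find the first episode boundary
# (either a terminal state or a timeout) to recover a single trajectory.
done = np.logical_or(data['terminals'], data['timeouts'])
end = int(np.argmax(done)) + 1 if done.any() else len(done)

expert_states = data['observations'][:end]  # shape (T, obs_dim), T <= 1000 for MuJoCo tasks
print(expert_states.shape)
```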