Adversarial Imitation Learning via Boosting
Authors: Jonathan Daniel Chang, Dhruv Sreenivas, Yingbing Huang, Kianté Brantley, Wen Sun
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirically, we evaluate our algorithm on both controller state-based and pixel-based environments from the DeepMind Control Suite. AILBoost outperforms DAC on both types of environments, demonstrating the benefit of properly weighting replay buffer data for off-policy training. |
| Researcher Affiliation | Academia | Jonathan D. Chang, Department of Computer Science, Cornell University, jdc396@cornell.edu; Dhruv Sreenivas, Department of Computer Science, Cornell University, ds844@cornell.edu; Yingbing Huang, Department of Electrical and Computer Engineering, University of Illinois Urbana-Champaign, yh21@illinois.edu; Kianté Brantley, Department of Computer Science, Cornell University, kdb82@cornell.edu; Wen Sun, Department of Computer Science, Cornell University, ws455@cornell.edu |
| Pseudocode | Yes | Algorithm 1 AILBOOST (Adversarial Imitation Learning via Boosting) |
| Open Source Code | No | The paper does not contain an unambiguous statement about releasing source code for the methodology described, nor does it provide a direct link to a code repository. |
| Open Datasets | Yes | We evaluate AILBoost on the DeepMind Control Suite (Tassa et al., 2018): Walker Walk, Cheetah Run, Ball in Cup Catch, Quadruped Walk, and Humanoid Stand. For each game, we train an expert RL agent using the environment's reward and collect 10 demonstrations which we use as the expert dataset throughout our experiments. |
| Dataset Splits | No | The paper discusses training and evaluation but does not explicitly detail a separate validation dataset split with specific percentages, counts, or methodologies. |
| Hardware Specification | No | The paper mentions using specific RL algorithms (SAC, DrQ-v2) for training but does not provide any specific hardware details such as CPU/GPU models, memory, or cloud computing instances used for running its experiments. |
| Software Dependencies | No | The paper mentions using SAC and DrQ-v2 (referencing 'pytorch_sac' in the bibliography) but does not provide specific version numbers for software components like Python, PyTorch, or other libraries that would be necessary for a reproducible setup. |
| Experiment Setup | Yes | Table 4: Hyperparameters used for DAC and AILBoost. All of DAC's hyperparameters are shared by AILBoost except for the parameters colored in blue. In particular, the update frequency of the discriminator vs. the policy is one of the core differences between DAC and AILBoost. |
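The "Research Type" row above highlights the paper's central empirical claim: properly weighting replay buffer data by its originating policy improves off-policy adversarial imitation learning. As a minimal sketch of that idea (not the paper's actual implementation — the tagging scheme, the `alpha` weights, and the buffer layout here are all hypothetical), one can sample minibatches from a replay buffer in proportion to a per-checkpoint weight:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical replay buffer: each transition is tagged with the index of
# the policy checkpoint (boosting iteration) that generated it.
transitions = [{"obs": i, "ckpt": i // 4} for i in range(12)]  # 3 checkpoints

# Hypothetical boosting-style mixture weights over checkpoints,
# normalized to a probability distribution.
alpha = np.array([0.2, 0.3, 0.5])
alpha = alpha / alpha.sum()

# Sampling probability of a transition is proportional to the weight of
# its originating checkpoint (uniform among transitions of one checkpoint).
probs = np.array([alpha[t["ckpt"]] for t in transitions])
probs = probs / probs.sum()

# Draw a weighted minibatch for an off-policy update.
batch_idx = rng.choice(len(transitions), size=4, p=probs)
batch = [transitions[i] for i in batch_idx]
```

This contrasts with plain DAC-style training, which samples the replay buffer uniformly regardless of which past policy produced each transition.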