Provable Representation Learning for Imitation Learning via Bi-level Optimization
Authors: Sanjeev Arora, Simon Du, Sham Kakade, Yuping Luo, Nikunj Saunshi
ICML 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We also provide proof-of-concept experiments to verify our theory. We conduct experiments in both settings to verify our theoretical insights by learning a representation from multiple tasks using our framework and testing it on a new task from the same setting. |
| Researcher Affiliation | Academia | 1Princeton University, Princeton, New Jersey, USA 2Institute for Advanced Study, Princeton, New Jersey, USA 3University of Washington, Seattle, Washington, USA. |
| Pseudocode | No | The paper describes the proposed framework and methods mathematically and in prose, but does not include structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper mentions using 'OpenAI Baselines' (Dhariwal et al., 2017) for policy training, but it does not state that the authors' own implementation code for the methodology described in the paper is openly available, nor does it provide a link. |
| Open Datasets | No | The paper describes custom environments 'Noisy Combination Lock' and 'Swimmer Velocity' for experiments but does not provide specific access information, citations to public datasets, or repositories for any training data used. |
| Dataset Splits | No | The paper mentions training on 'train task data' and testing on 'a new task µ' but does not provide specific details on how datasets are split into training, validation, and test sets (e.g., percentages, sample counts, or explicit splitting methodology). |
| Hardware Specification | Yes | Our experiments are conducted on an NVIDIA RTX 2080 Ti. |
| Software Dependencies | No | The paper mentions using 'Adam optimizer' and 'OpenAI Baselines' but does not provide specific version numbers for these software components. |
| Experiment Setup | Yes | We use Adam optimizer (Kingma & Ba, 2014) with learning rate 0.001. |
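
For context on the quoted experiment setup (Adam optimizer, learning rate 0.001) and the multi-task representation-learning framework described above, the following is a minimal sketch, not the authors' released code. It trains a shared representation with per-task behavioral-cloning heads jointly, which only approximates the paper's bi-level procedure; the dimensions, network sizes, and the `demo_batch` helper are hypothetical placeholders.

```python
import torch
import torch.nn as nn

# Hypothetical dimensions; the paper's environments (e.g. "Swimmer Velocity") define their own.
STATE_DIM, REPR_DIM, ACTION_DIM, NUM_TASKS = 8, 4, 2, 10

# Shared representation phi(s), intended to be reused on a new task after training.
phi = nn.Sequential(nn.Linear(STATE_DIM, 64), nn.ReLU(), nn.Linear(64, REPR_DIM))
# Task-specific linear policy heads on top of phi (behavioral cloning per training task).
heads = nn.ModuleList([nn.Linear(REPR_DIM, ACTION_DIM) for _ in range(NUM_TASKS)])

# Adam with learning rate 0.001, matching the quoted experiment setup.
opt = torch.optim.Adam(list(phi.parameters()) + list(heads.parameters()), lr=0.001)

def demo_batch(task_id):
    """Placeholder for a batch of expert (state, action) pairs from one training task."""
    states = torch.randn(32, STATE_DIM)
    actions = torch.randn(32, ACTION_DIM)
    return states, actions

for step in range(1000):
    task_id = step % NUM_TASKS
    states, expert_actions = demo_batch(task_id)
    pred_actions = heads[task_id](phi(states))
    loss = nn.functional.mse_loss(pred_actions, expert_actions)  # behavioral-cloning loss
    opt.zero_grad()
    loss.backward()
    opt.step()
```

After training, the representation `phi` would be frozen and only a new linear head fit on demonstrations from a held-out task, mirroring the paper's "learn from multiple tasks, test on a new task" evaluation; the snippet above does not reproduce that evaluation step.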