Interactive Learning from Activity Description
Authors: Khanh X Nguyen, Dipendra Misra, Robert Schapire, Miroslav Dudik, Patrick Shafto
ICML 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical results in two challenging request-fulfilling problems demonstrate the strengths of our approach: compared with RL baselines, it is more sample-efficient; compared with IL baselines, it achieves competitive success rates without requiring the teaching agent to be able to demonstrate the desired behavior using the learning agent's actions. Apart from empirical evaluation, we also provide theoretical guarantees for our algorithm under certain assumptions about the teacher and the environment. |
| Researcher Affiliation | Collaboration | (1) Department of Computer Science, University of Maryland, Maryland, USA; (2) Microsoft Research, New York, USA; (3) Rutgers University, New Jersey, USA. |
| Pseudocode | Yes | Algorithm 1 ILIAD protocol. ... Algorithm 2 Simple algorithm for learning an agent's policy... Algorithm 3 ADEL: our implementation of the ILIAD protocol. |
| Open Source Code | Yes | The code of our experiments is available at https://github.com/khanhptnk/iliad. |
| Open Datasets | Yes | We empirically evaluate ADEL against IL and RL baselines on two tasks: vision-language navigation (Anderson et al., 2018), and word-modification via regular expressions (Andreas et al., 2018). |
| Dataset Splits | Yes | Each of the two experimented problems is accompanied by data that is partitioned into training/validation/test splits. We use the training split as B_sim and use the other two splits for validation and testing, respectively. |
| Hardware Specification | No | The paper states: 'We also thank Microsoft GCR team for providing computational resources.' This is a general statement about resources and does not provide specific hardware details (e.g., GPU/CPU models, memory). |
| Software Dependencies | No | The paper mentions 'RNN-based conditional language model' and 'regular expression compiler' but does not provide specific software names with version numbers for reproducibility. |
| Experiment Setup | No | The paper states: 'Details about the data, the model architecture, training hyperparameters, and how the teacher is simulated in each problem are in the Appendix.' However, these details are not provided in the main text of the paper. |