Learning Compound Tasks without Task-specific Knowledge via Imitation and Self-supervised Learning

Authors: Sang-Hyun Lee, Seung-Woo Seo

ICML 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate our method against several baselines on compound tasks. The results show that our method achieves state-of-the-art performance on compound tasks, outperforming prior imitation learning methods.
Researcher Affiliation | Collaboration | 1 ThorDrive, Seoul, South Korea; 2 Department of Electrical and Computer Engineering, Seoul National University, Seoul, South Korea.
Pseudocode | No | The paper describes its method in text and mathematical equations, but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide an explicit statement or link indicating the public availability of its source code.
Open Datasets | Yes | We conduct experiments on compound tasks from the OpenAI Gym benchmark suites (Brockman et al., 2016). In addition to these provided tasks, we also introduce new compound tasks Mountain Toy Car and Mountain Toy Car Continuous that are variants of classic control tasks from the OpenAI Gym.
Dataset Splits | No | The paper does not provide specific details on how the demonstrations or data are split into training, validation, and test sets, such as percentages or sample counts.
Hardware Specification | Yes | All of the experiments were performed on a PC with a 3.60 GHz Intel Core i7-9700K Processor, and a GeForce RTX 2080 Ti GPU.
Software Dependencies | No | The paper mentions software like "OpenAI Gym" and "MuJoCo physics engine" but does not provide specific version numbers for these or other software dependencies.
Experiment Setup | No | The paper states "λ1 is the hyperparameter for the regularization term and λ2 is the hyperparameter for the entropy term." and mentions "Appendix C contains further details on our experimental setup.", but the provided text does not include specific hyperparameter values or detailed training configurations.
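The Open Datasets row notes that Mountain Toy Car and Mountain Toy Car Continuous are variants of classic control tasks from OpenAI Gym. For readers who want to reproduce the base environments, a minimal sketch using the stock Gym API (as it existed around the paper's 2020 release) follows; the custom variants themselves are not released, so only the standard environment IDs appear here, and the random policy is merely a placeholder.

    import gym

    # Discrete-action base task; "Mountain Toy Car" is described as a variant of this.
    env = gym.make("MountainCar-v0")
    obs = env.reset()
    done = False
    while not done:
        action = env.action_space.sample()  # placeholder random policy
        obs, reward, done, info = env.step(action)
    env.close()

    # Continuous-action base task, the basis for "Mountain Toy Car Continuous".
    env = gym.make("MountainCarContinuous-v0")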
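Similarly, the Experiment Setup row quotes the roles of λ1 (weight on the regularization term) and λ2 (weight on the entropy term), but no concrete values appear in the available text. Below is a hedged sketch of how an objective of the common form L = L_policy + λ1·L_reg − λ2·H(π) is typically assembled in PyTorch; the function name and the numeric defaults are illustrative placeholders, not the paper's actual configuration.

    import torch
    from torch.distributions import Categorical

    def composite_loss(policy_loss, reg_term, action_logits,
                       lambda1=0.1, lambda2=0.01):
        # L = L_policy + lambda1 * L_reg - lambda2 * H(pi)
        # lambda1 weights the regularization term; lambda2 weights the
        # entropy bonus. The default values are placeholders, not the
        # paper's settings (those are in its Appendix C).
        entropy = Categorical(logits=action_logits).entropy().mean()
        return policy_loss + lambda1 * reg_term - lambda2 * entropy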