Grounding Language Plans in Demonstrations Through Counterfactual Perturbations

Authors: Yanwei Wang, Tsun-Hsuan Wang, Jiayuan Mao, Michael Hagenow, Julie Shah

ICLR 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We show our approach improves the interpretability and reactivity of imitation learning through 2D navigation and simulated and real robot manipulation tasks. We evaluate our method on three sets of experiments: (1) a 2D navigation task, (2) simulated robot manipulation tasks in Robosuite (Zhu et al., 2020), and (3) a real-robot implementation of the 2D navigation and a marble-scooping task.
Researcher Affiliation | Academia | Yanwei Wang, Tsun-Hsuan Wang, Jiayuan Mao, Michael Hagenow, Julie Shah, MIT CSAIL. Corresponding author: yanwei@mit.edu
Pseudocode | No | No structured pseudocode or algorithm blocks were found in the paper.
Open Source Code | No | The paper links a project website (https://yanweiw.github.io/glide/), but this is not an explicit code repository link or a statement about code release. The paper does not explicitly state "We release our code..." or "The source code is available at...".
Open Datasets | Yes | We evaluate our method on three sets of experiments: (1) a 2D navigation task, (2) simulated robot manipulation tasks in Robosuite (Zhu et al., 2020), and (3) a real-robot implementation of the 2D navigation and a marble-scooping task. (A hedged Robosuite setup sketch follows this table.)
Dataset Splits | No | No explicit details on train/test/validation dataset splits (percentages, counts, or explicit splitting methodology) were found. The paper mentions using 'fewer than 10 successful demonstrations for classifier learning and policy learning' and discusses 'test' performance, but not how data was partitioned into training, validation, and test sets.
Hardware Specification | No | No specific hardware specifications (e.g., GPU/CPU models, memory details) used for running the experiments were found in the paper.
Software Dependencies | No | No specific software dependencies with version numbers (e.g., library names with versions) were found in the paper.
Experiment Setup | No | The paper mentions 'fewer than 10 successful demonstrations' and refers to 'hyperparameters for balancing loss terms', but does not provide specific values for these hyperparameters (e.g., learning rate, batch size, epochs) or detailed training configurations in the main text.
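Because the paper evaluates on simulated manipulation tasks in Robosuite but, as noted above, does not list software versions or training configurations, the following is a minimal sketch of how a Robosuite environment could be instantiated for a reproduction attempt. The task name ("Lift") and robot model ("Panda") are illustrative assumptions, not configurations confirmed by the paper.

```python
# Minimal sketch: instantiating a Robosuite manipulation environment.
# Assumption: "Lift" and "Panda" are placeholders; the paper does not specify its tasks or robots here.
import numpy as np
import robosuite as suite

env = suite.make(
    env_name="Lift",               # hypothetical task choice
    robots="Panda",                # hypothetical robot model
    has_renderer=False,            # no on-screen rendering
    has_offscreen_renderer=False,
    use_camera_obs=False,          # low-dimensional state observations
)

obs = env.reset()
low, high = env.action_spec        # per-dimension action bounds
for _ in range(200):
    # Random actions stand in for the learned imitation policy described in the paper.
    action = np.random.uniform(low, high)
    obs, reward, done, info = env.step(action)
    if done:
        obs = env.reset()
env.close()
```

This only demonstrates the simulation interface; reproducing the paper's results would additionally require its demonstrations, classifier, and policy-learning hyperparameters, which are not reported in the main text.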