Grounding Language Plans in Demonstrations Through Counterfactual Perturbations

Authors: Yanwei Wang, Tsun-Hsuan Wang, Jiayuan Mao, Michael Hagenow, Julie Shah

ICLR 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We show our approach improves the interpretability and reactivity of imitation learning through 2D navigation and simulated and real robot manipulation tasks. We evaluate our method on three sets of experiments: (1) a 2D navigation task, (2) simulated robot manipulation tasks in Robosuite (Zhu et al., 2020), and (3) a real-robot implementation of the 2D navigation and a marble-scooping task.
Researcher Affiliation | Academia | Yanwei Wang, Tsun-Hsuan Wang, Jiayuan Mao, Michael Hagenow, Julie Shah, MIT CSAIL. Corresponding author: yanwei@mit.edu
Pseudocode | No | No structured pseudocode or algorithm blocks were found in the paper.
Open Source Code | No | The paper links a project website (https://yanweiw.github.io/glide/), but this is not an explicit code repository link or a statement about code release. The paper does not explicitly state "We release our code..." or "The source code is available at...".
Open Datasets | Yes | We evaluate our method on three sets of experiments: (1) a 2D navigation task, (2) simulated robot manipulation tasks in Robosuite (Zhu et al., 2020), and (3) a real-robot implementation of the 2D navigation and a marble-scooping task. (A hedged Robosuite setup sketch follows this table.)
Dataset Splits | No | No explicit details on train/test/validation dataset splits (percentages, counts, or explicit splitting methodology) were found. The paper mentions using 'fewer than 10 successful demonstrations for classifier learning and policy learning' and discusses 'test' performance, but not how data was partitioned into training, validation, and test sets.
Hardware Specification | No | No specific hardware specifications (e.g., GPU/CPU models, memory details) used for running the experiments were found in the paper.
Software Dependencies | No | No specific software dependencies with version numbers (e.g., library names with versions) were found in the paper.
Experiment Setup | No | The paper mentions 'fewer than 10 successful demonstrations' and refers to 'hyperparameters for balancing loss terms', but does not provide specific values for these hyperparameters (e.g., learning rate, batch size, epochs) or detailed training configurations in the main text.
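Because the paper evaluates on simulated manipulation tasks in Robosuite but, as noted above, does not list software versions or training configurations, the following is a minimal sketch of how a Robosuite environment could be instantiated for a reproduction attempt. The task name ("Lift") and robot model ("Panda") are illustrative assumptions, not configurations confirmed by the paper.

```python
# Minimal sketch: instantiating a Robosuite manipulation environment.
# Assumption: "Lift" and "Panda" are placeholders; the paper does not specify its tasks or robots here.
import numpy as np
import robosuite as suite

env = suite.make(
    env_name="Lift",               # hypothetical task choice
    robots="Panda",                # hypothetical robot model
    has_renderer=False,            # no on-screen rendering
    has_offscreen_renderer=False,
    use_camera_obs=False,          # low-dimensional state observations
)

obs = env.reset()
low, high = env.action_spec        # per-dimension action bounds
for _ in range(200):
    # Random actions stand in for the learned imitation policy described in the paper.
    action = np.random.uniform(low, high)
    obs, reward, done, info = env.step(action)
    if done:
        obs = env.reset()
env.close()
```

This only demonstrates the simulation interface; reproducing the paper's results would additionally require its demonstrations, classifier, and policy-learning hyperparameters, which are not reported in the main text.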