Generating Language Corrections for Teaching Physical Control Tasks
Authors: Megha Srivastava, Noah Goodman, Dorsa Sadigh
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through both automatic and human evaluations, we show that CORGI can (i) generate valid feedback for novel student trajectories, (ii) outperform baselines on domains with novel control dynamics, and (iii) improve student learning in an interactive drawing task. |
| Researcher Affiliation | Academia | ¹Department of Computer Science, Stanford University ²Department of Psychology, Stanford University. |
| Pseudocode | Yes | Algorithm 1 Train CORGI |
| Open Source Code | Yes | We include information about accessing our dataset, model checkpoints, and user study infrastructure at this link: https://github.com/Stanford-ILIAD/corgi. |
| Open Datasets | Yes | DRAWING: ...from the Omniglot dataset (Lake et al., 2015). STEERING: ...the Parking environment from Leurent (2018)... MOVEMENT: ...from the BABEL dataset (Punnakkal et al., 2021) of 3D human motion |
| Dataset Splits | Yes | We split our training dataset into train and valid splits, and use the latter to perform early stopping. |
| Hardware Specification | Yes | The trajectory encoder $M_{\text{traj},\theta}$ part of CORGI is trained for 200 epochs on one NVIDIA A40 GPU |
| Software Dependencies | Yes | The frozen LM we use is the 124M-parameter version of GPT-2 from Wolf et al. (2019a). ... partially-trained Soft Actor-Critic agents trained for only 100 epochs using the Stable Baselines3 implementation |
| Experiment Setup | Yes | The trajectory encoder $M_{\text{traj},\theta}$ part of CORGI is trained for 200 epochs on one NVIDIA A40 GPU with a batch size of 64 and learning rate of 0.05... We set the parameter n for $M_{\text{traj},\theta}$ to be 20, so the trajectory encoder outputs a set of 20 vectors with dimension 768. $M_{\text{traj},\theta}$ is a 3-layer feed-forward neural network, where each layer has an output size of n × 768 = 20 × 768. |
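
The layer sizes quoted in the Experiment Setup row can be checked with a little arithmetic. The sketch below is only an illustration under stated assumptions, not the authors' implementation: `EncoderConfig`, `layer_shapes`, `parameter_count`, and `split_into_vectors` are hypothetical names, and `input_dim` is a placeholder because the source does not give the size of the flattened trajectory input. The vector dimension of 768 matches the hidden size of the 124M-parameter GPT-2, though how the frozen LM consumes the 20 vectors is not described in the row.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EncoderConfig:
    """Hyperparameters quoted in the table, plus one placeholder."""
    n_vectors: int = 20          # n: number of output vectors
    embed_dim: int = 768         # dimension of each output vector
    num_layers: int = 3          # feed-forward layers
    batch_size: int = 64
    learning_rate: float = 0.05
    epochs: int = 200
    input_dim: int = 128         # ASSUMPTION: not stated in the source

    @property
    def layer_out(self) -> int:
        # Each layer's output size is n x 768.
        return self.n_vectors * self.embed_dim


def layer_shapes(cfg: EncoderConfig) -> list[tuple[int, int]]:
    """Return (fan_in, fan_out) for each dense layer."""
    shapes = []
    fan_in = cfg.input_dim
    for _ in range(cfg.num_layers):
        shapes.append((fan_in, cfg.layer_out))
        fan_in = cfg.layer_out
    return shapes


def parameter_count(cfg: EncoderConfig, bias: bool = True) -> int:
    """Count weights (and optionally biases) across all dense layers."""
    total = 0
    for fan_in, fan_out in layer_shapes(cfg):
        total += fan_in * fan_out
        if bias:
            total += fan_out
    return total


def split_into_vectors(flat: list[float], cfg: EncoderConfig) -> list[list[float]]:
    """Reshape one flat encoder output into n vectors of size embed_dim."""
    if len(flat) != cfg.layer_out:
        raise ValueError(f"expected {cfg.layer_out} values, got {len(flat)}")
    d = cfg.embed_dim
    return [flat[i * d:(i + 1) * d] for i in range(cfg.n_vectors)]


if __name__ == "__main__":
    cfg = EncoderConfig()
    print("per-layer output size:", cfg.layer_out)
    print("layer shapes:", layer_shapes(cfg))
    print("parameters (with assumed input_dim):", parameter_count(cfg))
```

With every layer emitting 20 × 768 = 15,360 values, the two square layers dominate the parameter count, so the assumed `input_dim` has little effect on the total.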