Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Language-Conditioned Imitation Learning for Robot Manipulation Tasks
Authors: Simon Stepputtis, Joseph Campbell, Mariano Phielipp, Stefan Lee, Chitta Baral, Heni Ben Amor
NeurIPS 2020 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluated our model in a dynamic-enabled simulator with random assortments of objects and procedurally generated instructions, with success in 84% of sequential tasks that required picking up a cup and pouring its contents into another vessel. |
| Researcher Affiliation | Collaboration | Simon Stepputtis¹, Joseph Campbell¹, Mariano Phielipp², Stefan Lee³, Chitta Baral¹, Heni Ben Amor¹; ¹Arizona State University, ²Intel AI Labs, ³Oregon State University |
| Pseudocode | No | The paper describes the methods in text and uses equations but does not provide structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | All data used in this paper, along with a trained model and the full source code can be found at: https://github.com/ir-lab/LanguagePolicies. |
| Open Datasets | Yes | All data used in this paper, along with a trained model and the full source code can be found at: https://github.com/ir-lab/LanguagePolicies. The final data set contained 22,500 complete task demonstrations composed of the two subtasks (grasping and pouring), resulting in 45,000 training samples. |
| Dataset Splits | Yes | Of these samples, we used 4,000 for validation and 1,000 for testing, leaving 40,000 for training. |
| Hardware Specification | No | The paper does not explicitly specify the hardware used for running experiments, such as GPU or CPU models. |
| Software Dependencies | No | The paper mentions software like Faster R-CNN, ResNet-101, GloVe, and Coppelia Sim but does not specify their version numbers or other ancillary software dependencies with versions. |
| Experiment Setup | Yes | The overall loss was a weighted sum of five auxiliary losses: $L = \alpha_a L_a + \alpha_t L_t + \alpha_\phi L_\phi + \alpha_w L_w + \alpha L$. Values $\alpha_a = 1$, $\alpha_t = 5$, $\alpha_\phi = 1$, $\alpha_w = 50$, $\alpha = 14$ were empirically chosen as hyper-parameters for $L$ that had been found by a grid-search approach. We trained our model in a supervised fashion by minimizing $L$ with an Adam optimizer using a learning rate of 0.0001. |
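
The Dataset Splits and Experiment Setup rows both reduce to simple arithmetic, which the following minimal Python sketch makes concrete. It is not the authors' training code: the function names, the dictionary keys, and the example loss values are hypothetical. The subscript of the fifth loss term is missing from the quoted text, so it appears here under the placeholder key `fifth`.

```python
# Weights quoted in the Experiment Setup row. The keys are illustrative
# names; "fifth" is a placeholder because the quoted text lost the
# subscript of the last loss term.
LOSS_WEIGHTS = {"a": 1.0, "t": 5.0, "phi": 1.0, "w": 50.0, "fifth": 14.0}

# Adam learning rate quoted in the table (0.0001).
LEARNING_RATE = 1e-4


def weighted_loss(losses, weights=LOSS_WEIGHTS):
    """Return the weighted sum of the auxiliary losses.

    `losses` maps each key in `weights` to a scalar loss value.
    """
    missing = set(weights) - set(losses)
    if missing:
        raise KeyError(f"missing loss terms: {sorted(missing)}")
    return sum(weights[name] * losses[name] for name in weights)


def training_samples(total, validation, test):
    """Return how many samples remain for training after the held-out sets."""
    return total - validation - test


if __name__ == "__main__":
    # Hypothetical per-term loss values, chosen only to show the weighting.
    example = {"a": 0.2, "t": 0.1, "phi": 0.05, "w": 0.01, "fifth": 0.02}
    print("weighted loss:", weighted_loss(example))
    print("training samples:", training_samples(45_000, 4_000, 1_000))
```

With the counts quoted in the table, `training_samples(45_000, 4_000, 1_000)` reproduces the 40,000 training samples reported in the Dataset Splits row.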