Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Program Guided Agent
Authors: Shao-Hua Sun, Te-Lin Wu, Joseph J. Lim
ICLR 2020 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on a 2D Minecraft environment not only demonstrate that the proposed framework learns to reliably accomplish program instructions and achieves zero-shot generalization to more complex instructions but also verify the efficiency of the proposed modulation mechanism for learning the multitask policy. |
| Researcher Affiliation | Academia | Shao-Hua Sun, Te-Lin Wu, Joseph J. Lim University of Southern California EMAIL |
| Pseudocode | Yes | Algorithm 1 Program Execution |
| Open Source Code | No | The paper does not provide a concrete link to its source code or explicitly state that its code is being released. |
| Open Datasets | No | The paper describes generating its own program sets and collecting natural language translations but does not provide concrete access (e.g., a URL, DOI, or specific citation for public access) to these datasets. |
| Dataset Splits | Yes | We sample 4,500 programs using our DSL and split them into 4,000 training programs (train) and 500 testing programs (test). To examine the framework's ability to generalize to more complex instructions, we generate 500 programs which are twice longer and contains more condition branches on average to construct a harder testing set (test-complex). |
| Hardware Specification | Yes | We train all our models on a single Nvidia Titan-X GPU, in a 40 core Ubuntu 16.04 Linux server. |
| Software Dependencies | No | The paper mentions "TensorFlow (Abadi et al., 2016)" and "GloVe Pennington et al. (2014) (50-D version)" but does not provide specific version numbers for the TensorFlow library or other key software components used for replication. |
| Experiment Setup | Yes | We use the following hyperparameters to train A2C agents for our model and all the end-to-end learning models: learning rate: 1 × 10⁻³, number of environment: 64, number of workers: 64, and number of update roll-out steps: 5. |
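
For readers attempting a replication, the split sizes and A2C hyperparameters quoted in the table can be captured in a small configuration sketch. This is not the authors' code (the table records that none was released). The names `A2CConfig` and `split_programs`, the shuffling step, and the seed are hypothetical. The learning rate is read as 1 × 10⁻³, which is an assumption because the value is garbled in the extracted text.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class A2CConfig:
    """Hyperparameters quoted in the Experiment Setup row (hypothetical container)."""
    learning_rate: float = 1e-3   # assumption: garbled in the source, read as 1 x 10^-3
    num_environments: int = 64
    num_workers: int = 64
    rollout_steps: int = 5


def split_programs(programs, n_train=4000, n_test=500, seed=0):
    """Split sampled programs into train/test sets with the sizes quoted above.

    The shuffle and the seed are assumptions; the source does not say how
    the 4,500 programs were assigned to each split.
    """
    if len(programs) != n_train + n_test:
        raise ValueError(f"expected {n_train + n_test} programs, got {len(programs)}")
    shuffled = list(programs)
    random.Random(seed).shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]


# Placeholder program identifiers stand in for programs sampled from the DSL.
programs = [f"program_{i}" for i in range(4500)]
train, test = split_programs(programs)
config = A2CConfig()
```

The harder test-complex set (500 longer programs) is generated separately in the paper, so it is deliberately left out of this split.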