Program Guided Agent

Authors: Shao-Hua Sun, Te-Lin Wu, Joseph J. Lim

ICLR 2020

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Experimental results on a 2D Minecraft environment not only demonstrate that the proposed framework learns to reliably accomplish program instructions and achieves zero-shot generalization to more complex instructions but also verify the efficiency of the proposed modulation mechanism for learning the multitask policy. |
| Researcher Affiliation | Academia | Shao-Hua Sun, Te-Lin Wu, Joseph J. Lim; University of Southern California; {shaohuas,telinwu,limjj}@usc.edu |
| Pseudocode | Yes | Algorithm 1 Program Execution |
| Open Source Code | No | The paper does not provide a concrete link to its source code or explicitly state that its code is being released. |
| Open Datasets | No | The paper describes generating its own program sets and collecting natural language translations but does not provide concrete access (e.g., a URL, DOI, or specific citation for public access) to these datasets. |
| Dataset Splits | Yes | We sample 4,500 programs using our DSL and split them into 4,000 training programs (train) and 500 testing programs (test). To examine the framework's ability to generalize to more complex instructions, we generate 500 programs which are twice longer and contains more condition branches on average to construct a harder testing set (test-complex). |
| Hardware Specification | Yes | We train all our models on a single Nvidia Titan-X GPU, in a 40 core Ubuntu 16.04 Linux server. |
| Software Dependencies | No | The paper mentions "TensorFlow (Abadi et al., 2016)" and "GloVe Pennington et al. (2014) (50-D version)" but does not provide specific version numbers for the TensorFlow library or other key software components used for replication. |
| Experiment Setup | Yes | We use the following hyperparameters to train A2C agents for our model and all the end-to-end learning models: learning rate: 1 × 10^-3, number of environment: 64, number of workers: 64, and number of update roll-out steps: 5. |
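
For anyone attempting a replication, the reported dataset split and A2C settings above can be gathered in one place. The following is only a minimal sketch, not the authors' code: the names `A2CConfig` and `split_programs`, the field names, the random seed, and the placeholder program strings are all hypothetical, and the batch-size line assumes each environment contributes one transition per roll-out step, which the source does not state.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class A2CConfig:
    """Reported A2C settings; class and field names are hypothetical."""
    learning_rate: float = 1e-3   # reported as 1 x 10^-3
    num_environments: int = 64
    num_workers: int = 64
    rollout_steps: int = 5        # update roll-out steps


def split_programs(programs, num_train=4000, num_test=500, seed=0):
    """Shuffle and split programs into train/test lists (seed is an assumption)."""
    if len(programs) < num_train + num_test:
        raise ValueError("not enough programs for the requested split")
    shuffled = list(programs)
    random.Random(seed).shuffle(shuffled)
    return shuffled[:num_train], shuffled[num_train:num_train + num_test]


config = A2CConfig()

# Placeholder strings stand in for the 4,500 sampled DSL programs,
# which are not publicly available.
train, test = split_programs([f"program_{i}" for i in range(4500)])

# Assumed transitions per update: one per environment per roll-out step.
batch_size = config.num_environments * config.rollout_steps
```

The 500-program test-complex set is generated separately in the paper, so it is deliberately left out of this split.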