Guiding Policies with Language via Meta-Learning
Authors: John D. Co-Reyes, Abhishek Gupta, Suvansh Sanjeev, Nick Altieri, Jacob Andreas, John DeNero, Pieter Abbeel, Sergey Levine
ICLR 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments analyze GPL in a partially observed object manipulation environment and a block pushing environment. |
| Researcher Affiliation | Academia | John D. Co-Reyes Abhishek Gupta Suvansh Sanjeev Nick Altieri Jacob Andreas John DeNero Pieter Abbeel Sergey Levine University of California, Berkeley jcoreyes@eecs.berkeley.edu |
| Pseudocode | Yes | Algorithm 1: GPL meta-training algorithm. |
| Open Source Code | No | Our code and supplementary material will be available at https://sites.google.com/view/lgpl/home |
| Open Datasets | No | Environments are generated by sampling a goal object color, goal object shape, and goal square color which are placed at random locations in different random rooms. |
| Dataset Splits | No | We train on 1700 of these environments and reserve a separate set for testing. |
| Hardware Specification | No | The paper mentions 'computational resources from Amazon and NVIDIA' in the acknowledgements, but it does not provide specific hardware details like exact GPU/CPU models, processor types, or memory amounts used for running experiments. |
| Software Dependencies | No | We use Adam for optimization with a learning rate of 0.001. |
| Experiment Setup | Yes | We use Adam for optimization with a learning rate of 0.001. MLP(32, 32) specifies a multilayer perceptron with 2 layers, each of size 32. CNN((4, 2x2, 1), (4, 2x2, 1)) specifies a 2-layer convolutional neural network where each layer has 4 filters, 2x2 kernels, and stride 1. Unless otherwise stated, we use ReLU activations. (See the code sketch after this table.) |
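The architecture notation quoted in the Experiment Setup row maps directly onto standard deep-learning code. Below is a minimal sketch of that setup; PyTorch as the framework, the helper names `make_mlp` and `make_cnn`, and all input sizes are our assumptions for illustration, not details taken from the paper.

```python
import torch
import torch.nn as nn

def make_mlp(in_dim: int, hidden: int = 32) -> nn.Sequential:
    # MLP(32, 32): two fully connected layers of size 32 with ReLU activations.
    return nn.Sequential(
        nn.Linear(in_dim, hidden), nn.ReLU(),
        nn.Linear(hidden, hidden), nn.ReLU(),
    )

def make_cnn(in_channels: int) -> nn.Sequential:
    # CNN((4, 2x2, 1), (4, 2x2, 1)): two convolutional layers, each with
    # 4 filters, 2x2 kernels, and stride 1, with ReLU activations.
    return nn.Sequential(
        nn.Conv2d(in_channels, 4, kernel_size=2, stride=1), nn.ReLU(),
        nn.Conv2d(4, 4, kernel_size=2, stride=1), nn.ReLU(),
    )

# Input sizes below are placeholders, not values from the paper.
policy_net = make_mlp(in_dim=16)
obs_encoder = make_cnn(in_channels=3)

# Adam with learning rate 0.001, as stated in the experiment setup.
params = list(policy_net.parameters()) + list(obs_encoder.parameters())
optimizer = torch.optim.Adam(params, lr=1e-3)
```

Note that this sketch only reconstructs the layer notation and optimizer settings the table quotes; how these modules are composed into the GPL meta-training algorithm (Algorithm 1) is not specified here.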