reproducibilityindex.ai

Third Person Imitation Learning

Authors: Bradly C Stadie, Pieter Abbeel, Ilya Sutskever

ICLR 2017

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	To validate our approach, we report successful experiments on learning from third-person demonstrations in a pointmass domain, a reacher domain, and inverted pendulum.
Researcher Affiliation	Collaboration	1 OpenAI; 2 UC Berkeley, Department of Statistics; 3 UC Berkeley, Departments of EECS and ICSI
Pseudocode	Yes	The entire process is summarized in algorithm 1.
Open Source Code	Yes	Code to train a third person imitation learning agent on the domains from this paper is presented here: https://github.com/bstadie/third_person_im
Open Datasets	No	The paper uses environments from the MuJoCo physics simulator (pointmass, reacher, inverted pendulum) but does not provide specific access information (links, citations, or repository names) for these environments as datasets or publicly available resources.
Dataset Splits	No	The paper does not specify exact training, validation, and test split percentages or sample counts for any dataset used.
Hardware Specification	No	The paper does not provide specific hardware details (e.g., CPU/GPU models, memory) used for running experiments.
Software Dependencies	No	ADAM is used for discriminator training with a learning rate of 0.001. The RL generator uses the off-the-shelf TRPO implementation available in RLLab. While 'ADAM' and 'RLLab' are mentioned, specific version numbers for these software dependencies are not provided.
Experiment Setup	Yes	Joint Feature Extractor: Input images are of size 50 x 50 with 3 channels (RGB). Layers are 2 convolutional layers, each followed by a max pooling layer of size 2. Layers use 5 filters of size 3 each. ... ADAM is used for discriminator training with a learning rate of 0.001. ... a value of 4 showed good performance over all tasks, and so this value was utilized in all other experiments.
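
The Experiment Setup row gives enough detail to estimate the size of the joint feature extractor's output. The sketch below is a minimal shape calculation, not the authors' code. It takes from the quoted text only the 50 x 50 RGB input, the 2 convolutional layers, the 5 filters of size 3, and the max pooling of size 2. The convolution stride of 1, the absence of padding, and a pooling stride equal to the pool size are assumptions, since the quoted excerpt does not state them. All function and class names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Shape:
    """Channels-first shape of one image or feature map."""
    channels: int
    height: int
    width: int

    @property
    def size(self) -> int:
        return self.channels * self.height * self.width


def conv2d(shape: Shape, filters: int, kernel: int,
           stride: int = 1, padding: int = 0) -> Shape:
    """Output shape of a square convolution.

    ASSUMPTION: stride 1 and no padding by default; the quoted setup
    does not specify either value.
    """
    def out(n: int) -> int:
        return (n + 2 * padding - kernel) // stride + 1
    return Shape(filters, out(shape.height), out(shape.width))


def max_pool(shape: Shape, size: int) -> Shape:
    """Output shape of max pooling.

    ASSUMPTION: pooling stride equals the pool size (non-overlapping).
    """
    return Shape(shape.channels, shape.height // size, shape.width // size)


def conv_params(in_channels: int, filters: int, kernel: int) -> int:
    """Weights plus one bias per filter for a square convolution."""
    return filters * (in_channels * kernel * kernel + 1)


def feature_extractor_shapes(input_shape: Shape = Shape(3, 50, 50),
                             filters: int = 5, kernel: int = 3,
                             pool: int = 2, layers: int = 2) -> list:
    """Shapes after each conv and pool step, starting with the input."""
    shapes = [input_shape]
    for _ in range(layers):
        shapes.append(conv2d(shapes[-1], filters, kernel))
        shapes.append(max_pool(shapes[-1], pool))
    return shapes


def feature_extractor_params(in_channels: int = 3, filters: int = 5,
                             kernel: int = 3, layers: int = 2) -> int:
    """Total trainable parameters across the convolutional layers."""
    total, channels = 0, in_channels
    for _ in range(layers):
        total += conv_params(channels, filters, kernel)
        channels = filters
    return total


if __name__ == "__main__":
    for step in feature_extractor_shapes():
        print(step, "->", step.size, "values")
    print("conv parameters:", feature_extractor_params())
```

Under these assumptions the spatial size goes 50, 48, 24, 22, 11, so the extractor emits a 5 x 11 x 11 feature map (605 values) from 370 convolutional parameters. Different padding or stride choices would change these figures.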