reproducibilityindex.ai

Guided Cost Learning: Deep Inverse Optimal Control via Policy Optimization

Authors: Chelsea Finn, Sergey Levine, Pieter Abbeel

ICML 2016

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We evaluate our method on a series of simulated tasks and real-world robotic manipulation problems, demonstrating substantial improvement over prior methods both in terms of task complexity and sample efficiency.
Researcher Affiliation	Academia	University of California, Berkeley, Berkeley, CA 94709 USA
Pseudocode	Yes	Algorithm 1 Guided cost learning
Open Source Code	No	The paper provides a link to a video ('http://rll.berkeley.edu/gcl') but no explicit statement or link for open-source code for the methodology.
Open Datasets	No	The paper mentions using 'expert demonstrations' or 'human demonstrations' as data, and describes how they were generated or provided ('Between 20 and 32 demonstrations were generated...', 'between 25 and 30 human demonstrations were provided via kinesthetic teaching'), but it does not provide concrete access information (link, DOI, repository, or citation to an established public dataset) for these demonstrations.
Dataset Splits	No	The paper mentions 'demonstrations' and 'test states' or 'test condition' but does not provide specific dataset split information (exact percentages, sample counts, or detailed splitting methodology) for training, validation, and testing.
Hardware Specification	No	The paper mentions using a 'PR2 robot' and 'MuJoCo physics simulator' but does not provide specific hardware details (exact GPU/CPU models, processor types, or memory amounts) used for running its experiments.
Software Dependencies	No	The paper mentions the 'MuJoCo physics simulator' and implies the use of 'neural network libraries based on backpropagation' but does not provide specific software names with version numbers.
Experiment Setup	Yes	We used a neural network cost function with two hidden layers with 24–52 units and rectifying nonlinearities of the form max(z, 0) followed by linear connections to a set of features $y_t$, which had a size of 20 for the 2D navigation task and 100 for the other two tasks. The cost is then given by $c_\theta(x_t, u_t) = \|A y_t + b\|^2 + w_u \|u_t\|^2$ (Eq. 2), with a fixed torque weight $w_u$ and the parameters consisting of $A$, $b$, and the network weights.
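
For readers who want a concrete picture of the cost function quoted in the Experiment Setup row, the following is a minimal NumPy sketch of its forward pass. It is not the authors' implementation. The function names (`init_params`, `cost`), the weight initialization, the choice to feed only the state x_t into the network, the square shape of A, and the default value of `w_u` are all assumptions made for illustration; the quoted text fixes only the two ReLU hidden layers, the feature size, and the form of the cost.

```python
import numpy as np

def init_params(state_dim, hidden=24, feat=20, seed=0):
    """Randomly initialize a cost network (hypothetical helper).

    hidden=24 is the low end of the quoted 24-52 range; feat=20 is the
    quoted feature size for the 2D navigation task.
    """
    rng = np.random.default_rng(seed)
    sizes = [state_dim, hidden, hidden, feat]
    layers = [(rng.normal(0.0, 0.1, (m, n)), np.zeros(n))
              for m, n in zip(sizes[:-1], sizes[1:])]
    # Assumption: A is square (feat x feat); the source does not state its shape.
    A = rng.normal(0.0, 0.1, (feat, feat))
    b = np.zeros(feat)
    return {"layers": layers, "A": A, "b": b}

def cost(params, x, u, w_u=1e-3):
    """Evaluate c(x, u) = ||A y + b||^2 + w_u * ||u||^2 for one time step."""
    (W1, b1), (W2, b2), (W3, b3) = params["layers"]
    h = np.maximum(x @ W1 + b1, 0.0)   # hidden layer 1, max(z, 0)
    h = np.maximum(h @ W2 + b2, 0.0)   # hidden layer 2, max(z, 0)
    y = h @ W3 + b3                    # linear connection to features y_t
    r = params["A"] @ y + params["b"]
    # w_u is a fixed (not learned) torque weight; 1e-3 is a placeholder value.
    return float(r @ r + w_u * (u @ u))

params = init_params(state_dim=4)
c = cost(params, x=np.ones(4), u=np.array([1.0, 2.0]))
```

Both terms are squared norms, so the cost is non-negative for any non-negative `w_u`, and the torque penalty is independent of the learned parameters.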