Guided Cost Learning: Deep Inverse Optimal Control via Policy Optimization
Authors: Chelsea Finn, Sergey Levine, Pieter Abbeel
ICML 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our method on a series of simulated tasks and real-world robotic manipulation problems, demonstrating substantial improvement over prior methods both in terms of task complexity and sample efficiency. |
| Researcher Affiliation | Academia | University of California, Berkeley, Berkeley, CA 94709 USA |
| Pseudocode | Yes | Algorithm 1 Guided cost learning |
| Open Source Code | No | The paper provides a link to a video ('http://rll.berkeley.edu/gcl') but no explicit statement or link for open-source code for the methodology. |
| Open Datasets | No | The paper mentions using 'expert demonstrations' or 'human demonstrations' as data, and describes how they were generated or provided ('Between 20 and 32 demonstrations were generated...', 'between 25 and 30 human demonstrations were provided via kinesthetic teaching'), but it does not provide concrete access information (link, DOI, repository, or citation to an established public dataset) for these demonstrations. |
| Dataset Splits | No | The paper mentions 'demonstrations' and 'test states' or 'test condition' but does not provide specific dataset split information (exact percentages, sample counts, or detailed splitting methodology) for training, validation, and testing. |
| Hardware Specification | No | The paper mentions using a 'PR2 robot' and 'MuJoCo physics simulator' but does not provide specific hardware details (exact GPU/CPU models, processor types, or memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper mentions the 'MuJoCo physics simulator' and implies the use of 'neural network libraries based on backpropagation' but does not provide specific software names with version numbers. |
| Experiment Setup | Yes | We used a neural network cost function with two hidden layers of 24–52 units and rectifying nonlinearities of the form max(z, 0), followed by linear connections to a set of features yt, which had a size of 20 for the 2D navigation task and 100 for the other two tasks. The cost is then given by cθ(xt, ut) = ‖Ayt + b‖² + wu‖ut‖² (Eq. 2 in the paper), with a fixed torque weight wu and the parameters consisting of A, b, and the network weights. |
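The quoted cost structure cθ(xt, ut) = ‖Ayt + b‖² + wu‖ut‖² can be sketched as a small NumPy model. This is a minimal illustration only: the layer sizes, initializations, default torque weight, and the helper names `init_cost_params` and `cost` are assumptions, not details taken from the paper.

```python
import numpy as np

# Hypothetical sketch of the paper's neural-network cost
#   c_theta(x_t, u_t) = ||A y_t + b||^2 + w_u ||u_t||^2,
# where y_t is the feature output of a two-hidden-layer ReLU network.
# All dimensions and initial values below are illustrative assumptions.

rng = np.random.default_rng(0)

def init_cost_params(state_dim, hidden=24, feat_dim=20):
    """Random parameters: two ReLU hidden layers, a linear map to
    features y_t, and the (A, b) pair inside the quadratic term."""
    return {
        "W1": rng.normal(scale=0.1, size=(hidden, state_dim)),
        "b1": np.zeros(hidden),
        "W2": rng.normal(scale=0.1, size=(hidden, hidden)),
        "b2": np.zeros(hidden),
        "W3": rng.normal(scale=0.1, size=(feat_dim, hidden)),  # linear connection to y_t
        "A": rng.normal(scale=0.1, size=(feat_dim, feat_dim)),
        "b": np.zeros(feat_dim),
    }

def cost(params, x_t, u_t, w_u=1e-3):
    """Evaluate c_theta(x_t, u_t) = ||A y_t + b||^2 + w_u ||u_t||^2."""
    relu = lambda z: np.maximum(z, 0.0)            # rectifier max(z, 0)
    h1 = relu(params["W1"] @ x_t + params["b1"])   # first hidden layer
    h2 = relu(params["W2"] @ h1 + params["b2"])    # second hidden layer
    y_t = params["W3"] @ h2                        # features y_t
    residual = params["A"] @ y_t + params["b"]
    return float(residual @ residual + w_u * (u_t @ u_t))
```

With zero biases as above, a zero state and zero torque yield zero cost, while any nonzero torque adds the wu‖ut‖² penalty.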