Value Iteration Networks
Authors: Aviv Tamar, Yi Wu, Garrett Thomas, Sergey Levine, Pieter Abbeel
NeurIPS 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate VIN based policies on discrete and continuous path-planning domains, and on a natural-language based search task. We demonstrate the effectiveness of VINs within standard RL and IL algorithms in various problems, among which require visual perception, continuous control, and also natural language based decision making in the WebNav challenge [23]. After training, the policy learns to map an observation to a planning computation relevant for the task, and generate action predictions based on the resulting plan. As we demonstrate, this leads to policies that generalize better to new, unseen, task instances. |
| Researcher Affiliation | Academia | Dept. of Electrical Engineering and Computer Sciences, UC Berkeley |
| Pseudocode | No | The paper includes architectural diagrams (Figure 2) but does not provide explicit pseudocode or algorithm blocks. |
| Open Source Code | Yes | Source code is available at https://github.com/avivt/VIN. |
| Open Datasets | No | The paper describes experiments on a "synthetic grid-world", "Mars landscape" images, and the "Wikipedia for Schools website". While these domains are mentioned, the paper does not provide explicit links, DOIs, or citations with author/year for public access to the specific datasets used for training or testing. |
| Dataset Splits | No | The paper mentions a "held-out test-set" but does not explicitly provide details about a validation set split or its size/proportion. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU models, CPU models, memory) used for running the experiments. The only tooling it mentions in this context is MuJoCo, used for physical simulation. |
| Software Dependencies | No | The paper mentions "Theano [28]" but does not specify a version number for Theano or any other software libraries or dependencies. It also mentions "Mujoco [29]" and "publicly available GPS code [7]" but without specific versions for these tools. |
| Experiment Setup | No | The paper describes high-level design choices for the VIN (e.g., the number of recurrence steps K, f_R as a CNN, the attention module, pre-training with discounted grid-world transitions) and compares with other network architectures. However, it does not provide specific numerical hyperparameters (e.g., learning rate, batch size, number of epochs, specific optimizer settings) used during training. |
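
Because the table notes that the paper gives no explicit pseudocode, the following is a minimal sketch of the kind of K-step value-iteration recurrence the "Experiment Setup" row refers to. It is not the authors' implementation: the names `MOVES`, `grid_kernels`, and `vi_module` are hypothetical, and the kernels are set by hand for deterministic four-neighbour moves, whereas a VIN would learn them (and would obtain the reward map from f_R rather than take it as input).

```python
import numpy as np

# Offsets into a 3x3 kernel for the four moves: up, down, left, right.
MOVES = [(0, 1), (2, 1), (1, 0), (1, 2)]


def grid_kernels(gamma):
    """Hand-set kernels for deterministic moves (assumption: a VIN learns these)."""
    kernels_r = np.zeros((len(MOVES), 3, 3))
    kernels_v = np.zeros((len(MOVES), 3, 3))
    kernels_r[:, 1, 1] = 1.0          # reward collected at the current cell
    for a, (di, dj) in enumerate(MOVES):
        kernels_v[a, di, dj] = gamma  # discounted value of the cell moved to
    return kernels_r, kernels_v


def vi_module(reward, kernels_r, kernels_v, k):
    """Run k recurrence steps: Q = conv([R, V]), then V = max over actions.

    reward: (H, W) array; kernels_r, kernels_v: (A, 3, 3) arrays.
    Returns the value map (H, W) and the last Q map (A, H, W).
    """
    h, w = reward.shape
    num_actions = kernels_r.shape[0]
    value = np.zeros((h, w))
    q = np.zeros((num_actions, h, w))
    r_pad = np.pad(reward, 1)  # zero padding, so off-grid cells are worth 0
    for _ in range(k):
        v_pad = np.pad(value, 1)
        q = np.zeros((num_actions, h, w))
        # 3x3 cross-correlation written out as nine shifted, weighted sums.
        for di in range(3):
            for dj in range(3):
                q += kernels_r[:, di, dj, None, None] * r_pad[di:di + h, dj:dj + w]
                q += kernels_v[:, di, dj, None, None] * v_pad[di:di + h, dj:dj + w]
        value = q.max(axis=0)  # max over the action channel
    return value, q
```

Calling `vi_module(reward, *grid_kernels(0.5), k=5)` on a small grid with a single rewarded cell yields values that fall off with distance from that cell, and `q[:, i, j].argmax()` gives the greedy move at cell `(i, j)`. In this sketch K only needs to be large enough for value to propagate across the grid.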