The Predictron: End-To-End Learning and Planning
Authors: David Silver, Hado van Hasselt, Matteo Hessel, Tom Schaul, Arthur Guez, Tim Harley, Gabriel Dulac-Arnold, David Reichert, Neil Rabinowitz, André Barreto, Thomas Degris
ICML 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We applied the predictron to procedurally generated random mazes and a simulator for the game of pool. The predictron yielded significantly more accurate predictions than conventional deep neural network architectures. |
| Researcher Affiliation | Industry | DeepMind, London. |
| Pseudocode | No | The paper does not contain any explicit pseudocode or algorithm blocks. |
| Open Source Code | No | The paper provides a URL for a video demonstration but does not explicitly provide access to the source code for the methodology described. |
| Open Datasets | No | The paper mentions using "procedurally generated random mazes" and a "simulator for the game of pool" implemented in MuJoCo, but it does not provide concrete access information (link, DOI, repository, or formal citation for a publicly available dataset) for these generated environments or the specific data used. |
| Dataset Splits | No | The paper mentions training models and evaluating performance but does not specify the exact percentages or counts for training, validation, or test dataset splits. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions the use of the MuJoCo physics engine but does not provide version numbers for it or for any other software dependencies. |
| Experiment Setup | Yes | All variants utilise a convolutional core with 2 intermediate hidden layers; parameters were updated by supervised learning (see appendix for more details). |
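The Experiment Setup row above states only that every variant uses a convolutional core with two intermediate hidden layers and that parameters are updated by supervised learning, deferring details to the appendix. The sketch below is a minimal, hypothetical illustration of that description; the channel count, kernel size, 20x20 input grid, MSE loss, and Adam optimizer are assumptions for illustration, not details reported in the paper.

```python
# Hypothetical sketch of a "convolutional core with 2 intermediate hidden
# layers" trained by supervised regression. All hyperparameters are assumed.
import torch
import torch.nn as nn


class ConvCore(nn.Module):
    """Two intermediate hidden convolutional layers, per the setup row;
    kernel size and channel count are illustrative guesses."""

    def __init__(self, channels: int = 32):
        super().__init__()
        self.hidden = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.ReLU(),
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        return self.hidden(state)


class PredictionHead(nn.Module):
    """Maps the core's feature map to a scalar prediction target."""

    def __init__(self, channels: int = 32, grid: int = 20):
        super().__init__()
        self.readout = nn.Linear(channels * grid * grid, 1)

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        return self.readout(state.flatten(start_dim=1))


if __name__ == "__main__":
    core, head = ConvCore(), PredictionHead()
    opt = torch.optim.Adam(
        list(core.parameters()) + list(head.parameters()), lr=1e-3
    )
    # Dummy supervised batch: random 20x20 feature maps and scalar targets
    # (stand-ins for maze observations and their prediction targets).
    x = torch.randn(8, 32, 20, 20)
    y = torch.randn(8, 1)
    pred = head(core(x))
    loss = nn.functional.mse_loss(pred, y)  # supervised parameter update
    opt.zero_grad()
    loss.backward()
    opt.step()
    print(f"loss: {loss.item():.4f}")
```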