Blazing the trails before beating the path: Sample-efficient Monte-Carlo planning

Authors: Jean-Bastien Grill, Michal Valko, Rémi Munos

NeurIPS 2016

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | In the present paper, we provide a sample complexity analysis of a new algorithm called TrailBlazer. Our contribution is an algorithm called TrailBlazer whose sampling strategy depends on the specific structure of the MDP and for which we provide sample complexity bounds in terms of a new problem-dependent measure of the quantity of near-optimal nodes.
Researcher Affiliation | Collaboration | Jean-Bastien Grill and Michal Valko: SequeL team, INRIA Lille - Nord Europe, France (jean-bastien.grill@inria.fr, michal.valko@inria.fr). Rémi Munos: Google DeepMind, UK (munos@google.com).
Pseudocode | Yes | Figure 1: TrailBlazer; Figure 2: AVG node; Figure 3: MAX node
Open Source Code | No | The paper does not provide any concrete access to source code for the described methodology.
Open Datasets | No | The paper is theoretical and analyzes an algorithm without mentioning or using any specific public dataset for training, validation, or testing.
Dataset Splits | No | The paper does not provide dataset split information, as it is a theoretical paper focused on algorithmic analysis rather than empirical evaluation on a specific dataset.
Hardware Specification | No | The paper does not provide any specific hardware details used for running experiments.
Software Dependencies | No | The paper does not list software dependencies or version numbers.
Experiment Setup | No | The paper does not provide experimental setup details such as hyperparameters or training configurations, as it is a theoretical paper.