reproducibilityindex.ai

Multiple-Step Greedy Policies in Approximate and Online Reinforcement Learning

Authors: Yonathan Efroni, Gal Dalal, Bruno Scherrer, Shie Mannor

NeurIPS 2018

Reproducibility Variable	Result	LLM Response
Research Type	Theoretical	In this work, we study multiple-step greedy algorithms in more practical setups. We begin by highlighting a counter-intuitive difficulty... we formulate and analyze online and approximate algorithms that use such a multi-step greedy operator... and a next indisputable step would be to empirically evaluate implementations of the algorithms presented here.
Researcher Affiliation	Academia	Yonathan Efroni (jonathan.efroni@gmail.com), Gal Dalal (gald@campus.technion.ac.il), Bruno Scherrer (bruno.scherrer@inria.fr), Shie Mannor (shie@ee.technion.ac.il); Department of Electrical Engineering, Technion, Israel Institute of Technology; INRIA, Villers les Nancy, France
Pseudocode	Yes	Algorithm 1: Two-Timescale Online κ-Policy-Iteration; Algorithm 2: κ-API; Algorithm 3: κ-PSDP
Open Source Code	No	The paper does not contain any statement about making its source code available. The discussion section states: "Lastly, a next indisputable step would be to empirically evaluate implementations of the algorithms presented here."
Open Datasets	No	The paper is theoretical and does not use datasets. It defines an MDP framework but does not mention specific training data.
Dataset Splits	No	The paper is theoretical and does not conduct experiments with datasets, thus no dataset splits are discussed.
Hardware Specification	No	The paper is theoretical and does not report on experiments, thus no hardware specifications are provided.
Software Dependencies	No	The paper is theoretical and does not report on experiments, thus no software dependencies with version numbers are provided.
Experiment Setup	No	The paper is theoretical and does not report on experiments, thus no experimental setup details like hyperparameters or training settings are provided.