Stochastic Shortest Path: Minimax, Parameter-Free and Towards Horizon-Free Regret

Authors: Jean Tarbouriech, Runlong Zhou, Simon S. Du, Matteo Pirotta, Michal Valko, Alessandro Lazaric

NeurIPS 2021

Reproducibility Variable	Result	LLM Response
Research Type	Theoretical	We design a novel model-based algorithm EB-SSP that carefully skews the empirical transitions and perturbs the empirical costs with an exploration bonus to induce an optimistic SSP problem whose associated value iteration scheme is guaranteed to converge. We prove that EB-SSP achieves the minimax regret rate...
Researcher Affiliation	Collaboration	Jean Tarbouriech, Facebook AI Research & Inria Lille, jean.tarbouriech@gmail.com; Runlong Zhou, Tsinghua University, zhourunlongvector@gmail.com; Simon S. Du, University of Washington & Facebook AI Research, ssdu@cs.washington.edu; Matteo Pirotta, Facebook AI Research Paris, pirotta@fb.com; Michal Valko, DeepMind Paris, valkom@deepmind.com; Alessandro Lazaric, Facebook AI Research Paris, lazaric@fb.com
Pseudocode	Yes	Algorithm 1: Algorithm EB-SSP
Open Source Code	No	The paper does not contain any statements about releasing open-source code or provide a link to a code repository for the methodology described.
Open Datasets	No	The paper is theoretical and focuses on algorithm design and proofs; it does not describe experiments involving datasets for training. Therefore, no information on publicly available datasets for training is provided.
Dataset Splits	No	The paper is theoretical and focuses on algorithm design and proofs; it does not describe empirical experiments that would involve training, validation, and test dataset splits.
Hardware Specification	No	The paper is theoretical, focusing on algorithm design and analysis, and does not describe any empirical experiments that would require specific hardware for execution. Therefore, no hardware specifications are provided.
Software Dependencies	No	The paper is theoretical and focuses on algorithm design and proofs; it does not describe empirical experiments that would require specific software dependencies with version numbers for reproducibility.
Experiment Setup	No	The paper is theoretical and focuses on algorithm design and analysis, rather than describing an empirical experiment with specific setup details like hyperparameters or training configurations.
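
The Research Type entry above describes skewing the empirical transitions and perturbing the empirical costs with an exploration bonus so that value iteration converges. The sketch below illustrates that general idea only; it is not the paper's Algorithm 1. The function name `optimistic_ssp_values`, the skewing weight 1/(n+1) toward the goal, the bonus form `beta / sqrt(n)`, and the array layout are all assumptions made for illustration.

```python
import numpy as np


def optimistic_ssp_values(p_hat, c_hat, counts, beta=1.0, tol=1e-8, max_iter=100_000):
    """Value iteration on a hypothetical optimistic tabular SSP model.

    p_hat  : (S, A, S+1) empirical transitions; the last index is the goal.
    c_hat  : (S, A) empirical costs in [0, 1].
    counts : (S, A) visit counts n(s, a).
    Returns the value vector (goal value fixed to 0) and a greedy policy.
    """
    num_states = p_hat.shape[0]
    n = counts.astype(float)

    # Assumed skewing: move mass 1/(n+1) from the empirical model to the goal,
    # so every state-action pair reaches the goal with positive probability.
    w = n / (n + 1.0)
    p_tilde = w[..., None] * p_hat
    p_tilde[..., -1] += 1.0 - w

    # Assumed bonus: a simple count-based term, subtracted and clipped at zero.
    bonus = beta / np.sqrt(np.maximum(n, 1.0))
    c_opt = np.clip(c_hat - bonus, 0.0, None)

    v = np.zeros(num_states + 1)  # v[-1] is the goal state, always 0
    for _ in range(max_iter):
        q = c_opt + p_tilde @ v  # (S, A, S+1) @ (S+1,) -> (S, A)
        v_new = np.append(q.min(axis=1), 0.0)
        if np.max(np.abs(v_new - v)) < tol:
            return v_new, q.argmin(axis=1)
        v = v_new
    raise RuntimeError("value iteration did not converge")
```

Under these assumptions, the non-goal mass of every skewed transition row is at most n/(n+1) < 1, so the update is a sup-norm contraction and the loop terminates. For instance, a single state that empirically self-loops with 3 visits, unit cost, and `beta=0` has fixed point V = 1 + 0.75 V, that is V = 4.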