Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Stochastic Shortest Path: Minimax, Parameter-Free and Towards Horizon-Free Regret
Authors: Jean Tarbouriech, Runlong Zhou, Simon S. Du, Matteo Pirotta, Michal Valko, Alessandro Lazaric
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We design a novel model-based algorithm EB-SSP that carefully skews the empirical transitions and perturbs the empirical costs with an exploration bonus to induce an optimistic SSP problem whose associated value iteration scheme is guaranteed to converge. We prove that EB-SSP achieves the minimax regret rate... |
| Researcher Affiliation | Collaboration | Jean Tarbouriech (Facebook AI Research & Inria Lille); Runlong Zhou (Tsinghua University); Simon S. Du (University of Washington & Facebook AI Research); Matteo Pirotta (Facebook AI Research Paris); Michal Valko (DeepMind Paris); Alessandro Lazaric (Facebook AI Research Paris) |
| Pseudocode | Yes | Algorithm 1: Algorithm EB-SSP |
| Open Source Code | No | The paper does not contain any statements about releasing open-source code or provide a link to a code repository for the methodology described. |
| Open Datasets | No | The paper is theoretical and focuses on algorithm design and proofs; it does not describe experiments involving datasets for training. Therefore, no information on publicly available datasets for training is provided. |
| Dataset Splits | No | The paper is theoretical and focuses on algorithm design and proofs; it does not describe empirical experiments that would involve training, validation, and test dataset splits. |
| Hardware Specification | No | The paper is theoretical, focusing on algorithm design and analysis, and does not describe any empirical experiments that would require specific hardware for execution. Therefore, no hardware specifications are provided. |
| Software Dependencies | No | The paper is theoretical and focuses on algorithm design and proofs; it does not describe empirical experiments that would require specific software dependencies with version numbers for reproducibility. |
| Experiment Setup | No | The paper is theoretical and focuses on algorithm design and analysis, rather than describing an empirical experiment with specific setup details like hyperparameters or training configurations. |
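
The Research Type row quotes the paper's description of EB-SSP: skew the empirical transitions, subtract an exploration bonus from the empirical costs, and run value iteration on the resulting optimistic SSP problem. The following is a minimal sketch of that idea under stated assumptions, not the authors' implementation. The function names, the dictionary-based data layout, the goal-mixing weight `1 / (n + 1)`, the clipping at zero, and the fixed `bonus` table are all illustrative assumptions; the paper's own bonus is defined in its Algorithm 1 and is not reproduced here.

```python
def skew_transitions(p_hat, n_visits, goal):
    """Mix each empirical transition row with a small point mass on the goal.

    Assumed form: p_skewed = n/(n+1) * p_hat + 1/(n+1) * 1[s' == goal],
    so the goal is reachable with positive probability from every pair.
    """
    skewed = {}
    for (s, a), row in p_hat.items():
        n = n_visits[(s, a)]
        weight = n / (n + 1.0)
        new_row = {s2: weight * p for s2, p in row.items()}
        new_row[goal] = new_row.get(goal, 0.0) + 1.0 / (n + 1.0)
        skewed[(s, a)] = new_row
    return skewed


def optimistic_value_iteration(states, actions, goal, c_hat, p_skewed, bonus,
                               tol=1e-8, max_iter=100_000):
    """Value iteration on the skewed model with bonus-reduced costs.

    `bonus` is a fixed placeholder table here (an assumption); values are
    clipped at zero so that they stay non-negative.
    """
    v = {s: 0.0 for s in states}
    for _ in range(max_iter):
        new_v = {goal: 0.0}  # the goal state is absorbing and cost-free
        for s in states:
            if s == goal:
                continue
            new_v[s] = min(
                max(
                    c_hat[(s, a)] - bonus[(s, a)]
                    + sum(p * v[s2] for s2, p in p_skewed[(s, a)].items()),
                    0.0,
                )
                for a in actions
            )
        gap = max(abs(new_v[s] - v[s]) for s in states)
        v = new_v
        if gap <= tol:
            break
    return v


# Toy problem (hypothetical): the only observed action never reached the goal.
states = ["s0", "g"]
actions = ["stay"]
p_hat = {("s0", "stay"): {"s0": 1.0}}
n_visits = {("s0", "stay"): 9}
c_hat = {("s0", "stay"): 1.0}
bonus = {("s0", "stay"): 0.0}

p_skewed = skew_transitions(p_hat, n_visits, goal="g")
values = optimistic_value_iteration(states, actions, "g", c_hat, p_skewed, bonus)
```

In this toy problem the unskewed empirical model never reaches the goal, so plain value iteration with unit costs would grow without bound. After skewing, the goal is reached with probability 0.1 per step and the iteration settles at a finite value, which illustrates why the skewing step matters for convergence.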