Online Stochastic Shortest Path with Bandit Feedback and Unknown Transition Function

Authors: Aviv Rosenberg, Yishay Mansour

NeurIPS 2019

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | The algorithms are fairly simple, and the main challenge is the analysis of the regret and computational complexity.
Researcher Affiliation | Collaboration | Aviv Rosenberg, Tel Aviv University, Israel (avivros007@gmail.com); Yishay Mansour, Tel Aviv University, Israel and Google Research, Israel (mansour.yishay@gmail.com)
Pseudocode | Yes | The efficient implementation of this algorithm is similar to the one of the original UC-O-REPS algorithm, and is described in details in the supplementary material (together with full pseudo-code).
Open Source Code | No | The paper does not provide concrete access to source code for the methodology described. It mentions pseudocode in supplementary material, but not executable source code.
Open Datasets | No | The paper is theoretical and does not conduct experiments involving datasets.
Dataset Splits | No | The paper is theoretical and does not conduct experiments, therefore no dataset splits are described.
Hardware Specification | No | The paper is theoretical and does not describe any specific hardware used for experiments.
Software Dependencies | No | The paper is theoretical and does not provide specific ancillary software details with version numbers.
Experiment Setup | No | The paper is theoretical and does not provide specific experimental setup details, hyperparameters, or training configurations.