reproducibilityindex.ai

Stochastic Shortest Path with Adversarially Changing Costs

Authors: Aviv Rosenberg, Yishay Mansour

IJCAI 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Theoretical	In this paper we present the adversarial SSP model that also accounts for adversarial changes in the costs over time, while the underlying transition function remains unchanged. Formally, an agent interacts with an SSP environment for K episodes, the cost function changes arbitrarily between episodes, and the transitions are unknown to the agent. We develop the ﬁrst algorithms for adversarial SSPs and prove high probability regret bounds of squareroot K assuming all costs are strictly positive, and sub-linear regret in the general case. First, we formalize the adversarial SSP model and deﬁne the notion of learning and regret. Second, we establish an efﬁcient implementation of OMD in the SSP model with known transitions and study the conditions under which it guarantees near-optimal K expected regret, showing that some modiﬁcations are necessary.
Researcher Affiliation	Collaboration	Aviv Rosenberg1 and Yishay Mansour1,2 1Tel Aviv University, Israel 2Google Research, Tel Aviv {avivros007,mansour.yishay}@gmail.com
Pseudocode	Yes	Pseudocode in Appendix C. We defer to the pseudocode in Appendix E. We defer to the full pseudocode in Appendix H
Open Source Code	No	The paper is theoretical and focuses on algorithm design and proofs of regret bounds. There is no statement or link indicating that the source code for the described methods is publicly available or will be released.
Open Datasets	No	This paper is theoretical, focusing on algorithm design and mathematical proofs, and therefore does not use or reference any datasets.
Dataset Splits	No	As a theoretical paper, it does not involve empirical experiments with data, and therefore does not discuss training, validation, or test dataset splits.
Hardware Specification	No	The paper is theoretical and does not describe any computational experiments or their setup, thus no hardware specifications are mentioned.
Software Dependencies	No	The paper is theoretical and focuses on algorithm design and proofs, without any mention of specific software or library dependencies with version numbers.
Experiment Setup	No	The paper is theoretical and does not present empirical experiments, therefore it does not include details about an experimental setup, hyperparameters, or system-level training settings.