Replicability in Reinforcement Learning

Authors: Amin Karbasi, Grigoris Velegkas, Lin Yang, Felix Zhou

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | We initiate the mathematical study of replicability as an algorithmic property in the context of reinforcement learning (RL). We derive upper bounds on the sample complexity in both settings and validate our results experimentally. On the other hand, our work focuses solely on the setting with the generative model. (A minimal sketch of this replicability property appears after the table.)
Researcher Affiliation | Collaboration | Amin Karbasi (Yale University, Google Research), amin.karbasi@yale.edu; Grigoris Velegkas (Yale University), grigoris.velegkas@yale.edu; Lin F. Yang (UCLA), linyang@ee.ucla.edu; Felix Zhou (Yale University), felix.zhou@yale.edu
Pseudocode | Yes | Algorithm A.1 (TV Indistinguishable Oracle for Multiple Query Estimation) and Algorithm A.2 (Sampling from Pairwise Optimal Coupling [Angel and Spinka, 2019]). (The shared-randomness rounding idea behind such oracles is sketched after the table.)
Open Source Code | No | The paper does not state that source code for its methodology is released, nor does it link to any code repository for its own work. It mentions open-source code only in the context of related work by other researchers.
Open Datasets | No | The paper is theoretical and does not use or reference any specific datasets for training or evaluation. It assumes access to a 'generative model', which is a theoretical sampling oracle rather than a publicly available dataset. (An interface sketch appears after the table.)
Dataset Splits | No | The paper is theoretical and does not conduct experiments involving dataset splits, so it provides no information on training, validation, or test splits.
Hardware Specification | No | The paper is theoretical and focuses on mathematical properties and algorithms; it does not describe any experimental setup or mention specific hardware.
Software Dependencies | No | The paper is theoretical and does not list software dependencies or version numbers required for implementation or reproduction.
Experiment Setup | No | The paper is theoretical and focuses on algorithm design and theoretical bounds rather than practical experimental setups, so it provides no hyperparameters or training settings.
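
The replicability property referenced under Research Type follows Impagliazzo et al. [2022]: an RL algorithm is replicable if two executions on independent sample sets, run with the same internal randomness, output the exact same policy with high probability. Below is a minimal Python sketch of that check, where `algorithm` and `sample_draws` are hypothetical stand-ins for any randomized RL algorithm and for drawing an i.i.d. batch from the generative model.

```python
import numpy as np

def replicability_check(algorithm, sample_draws, seed):
    """Run `algorithm` twice on independent batches but with the SAME
    internal randomness, then test whether the output policies coincide.
    `algorithm(samples, rng)` and `sample_draws()` are hypothetical
    stand-ins, not functions from the paper."""
    samples_1 = sample_draws()  # first i.i.d. batch from the generative model
    samples_2 = sample_draws()  # second, independent i.i.d. batch

    # Shared internal randomness: both executions are seeded identically.
    policy_1 = algorithm(samples_1, np.random.default_rng(seed))
    policy_2 = algorithm(samples_2, np.random.default_rng(seed))

    # Replicability asks for exact equality with high probability
    # over both the samples and the shared seed.
    return np.array_equal(policy_1, policy_2)
```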
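
For the Pseudocode row: one common route to replicable estimation oracles of the kind named in Algorithm A.1 is randomized rounding with shared randomness (Impagliazzo et al., 2022). The sketch below illustrates only that rounding trick; it is not the paper's Algorithm A.1, and `grid_width` is a hypothetical tolerance parameter.

```python
import numpy as np

def replicable_mean_estimate(samples, grid_width, rng):
    """Estimate a mean replicably by snapping the empirical mean to a
    randomly shifted grid. A sketch of the shared-randomness rounding
    idea, not the paper's exact construction."""
    empirical_mean = np.mean(samples)
    # Shared random offset: identical across two executions because the
    # internal randomness (rng) is shared.
    offset = rng.uniform(0.0, grid_width)
    # Snap the estimate to the randomly shifted grid.
    return np.floor((empirical_mean - offset) / grid_width) * grid_width + offset
```

Two executions that share `rng` but see independent samples disagree only when their empirical means land in different cells of the shifted grid, which happens with probability roughly the gap between the two means divided by `grid_width`.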
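
Finally, the 'generative model' mentioned under Open Datasets is a sampling oracle for a tabular MDP, not a dataset. A minimal interface sketch, where the transition tensor `P` and reward table `R` are hypothetical inputs chosen for illustration:

```python
import numpy as np

class TabularGenerativeModel:
    """Sampling oracle for a tabular MDP: given any state-action pair,
    return an independent sample of the next state and the reward."""

    def __init__(self, P, R, rng):
        self.P = P      # P[s, a] is a probability vector over next states
        self.R = R      # R[s, a] is the (deterministic) reward
        self.rng = rng

    def sample(self, s, a):
        next_state = self.rng.choice(len(self.P[s, a]), p=self.P[s, a])
        return next_state, self.R[s, a]
```

Sample-complexity bounds in this setting count the number of calls to `sample(s, a)`.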