Concurrent PAC RL

Authors: Zhaohan Guo, Emma Brunskill

AAAI 2015 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our preliminary experiments confirm this result and show empirical benefits. We also provide small simulation experiments that support our theoretical results and demonstrate the advantage of carefully sharing information during concurrent reinforcement learning.
Researcher Affiliation | Academia | Zhaohan Guo and Emma Brunskill, Carnegie Mellon University, 5000 Forbes Ave., Pittsburgh, PA 15213, United States
Pseudocode | Yes | Algorithm 1 PAC-EXPLORE
Open Source Code | No | No explicit statement or link to an open-source code release for the described methodology was found.
Open Datasets | No | We use a 3x3 gridworld (Figure 1(a)).
Dataset Splits | No | Each run was for 10000 time steps and all experiments were averaged over 100 runs.
Hardware Specification | No | No specific hardware details (such as GPU/CPU models or memory) used for running the experiments were mentioned.
Software Dependencies | No | The paper mentions algorithms such as MBIE and PAC-EXPLORE but does not list any specific software dependencies with version numbers (e.g., programming languages, libraries, or solvers).
Experiment Setup | Yes | We tuned the confidence interval parameters to maximize the cumulative reward for acting in a single task, and then used the same settings for all concurrent RL scenarios. We set m = ∞, which essentially corresponds to always continuing to improve and refine the parameter estimates (fixing them after a certain number of experiences is important for the theoretical results, but empirically it is often best to use all available experience). The PAC-EXPLORE algorithm was optimized with m_e = 1 and T = 4, and these settings were fixed for all runs. (An illustrative harness sketch follows the table.)
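For orientation, below is a minimal, self-contained Python sketch of an experiment harness that mirrors the reported settings (3x3 gridworld, m_e = 1, T = 4, 10000 time steps per run, averaged over 100 runs). The environment dynamics, reward placement, and the simple count-based exploration rule are assumptions made purely for illustration; this is not the paper's PAC-EXPLORE implementation or its concurrent-RL setup.

```python
# Illustrative harness only: a hypothetical 3x3 gridworld and a simple
# count-based exploring agent standing in for PAC-EXPLORE. The constants
# mirror the reported setup; the dynamics, reward, and agent logic are assumed.
import random

GRID = 3                 # 3x3 gridworld (paper, Figure 1(a))
M_E = 1                  # visits before a state-action is treated as "known"
T = 4                    # phase length reported in the paper; recorded here only, unused by this toy agent
STEPS_PER_RUN = 10_000   # each run was 10000 time steps
NUM_RUNS = 100           # results averaged over 100 runs

ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def step(state, action):
    """Hypothetical deterministic dynamics: move within the grid, reward 1 at one corner."""
    r, c = state
    dr, dc = action
    nr = min(max(r + dr, 0), GRID - 1)
    nc = min(max(c + dc, 0), GRID - 1)
    reward = 1.0 if (nr, nc) == (GRID - 1, GRID - 1) else 0.0
    return (nr, nc), reward


def run_once(rng):
    """One run: prefer under-visited ("unknown") state-actions, otherwise act uniformly at random."""
    counts = {}          # (state, action) -> visit count; m = "infinity": counts are never frozen
    state = (0, 0)
    total = 0.0
    for _ in range(STEPS_PER_RUN):
        unknown = [a for a in ACTIONS if counts.get((state, a), 0) < M_E]
        action = rng.choice(unknown) if unknown else rng.choice(ACTIONS)
        counts[(state, action)] = counts.get((state, action), 0) + 1
        state, reward = step(state, action)
        total += reward
    return total


rng = random.Random(0)
avg = sum(run_once(rng) for _ in range(NUM_RUNS)) / NUM_RUNS
print(f"average cumulative reward over {NUM_RUNS} runs: {avg:.1f}")
```

The averaging loop reflects the reported evaluation protocol (100 independent runs of 10000 steps each); the exploration rule is only a stand-in to show where the m_e threshold would enter such a harness.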