Provably Efficient Exploration for Reinforcement Learning Using Unsupervised Learning
Authors: Fei Feng, Ruosong Wang, Wotao Yin, Simon S. Du, Lin F. Yang
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirically, we instantiate our framework on a class of hard exploration problems to demonstrate the practicality of our theory. |
| Researcher Affiliation | Academia | Fei Feng (University of California, Los Angeles, fei.feng@math.ucla.edu); Ruosong Wang (Carnegie Mellon University, ruosongw@andrew.cmu.edu); Wotao Yin (University of California, Los Angeles, wotaoyin@math.ucla.edu); Simon S. Du (University of Washington, ssdu@cs.washington.edu); Lin F. Yang (University of California, Los Angeles, linyang@ee.ucla.edu) |
| Pseudocode | Yes | Algorithm 1: A Unified Framework for Unsupervised RL; Algorithm 2: Trajectory Sampling Routine TSR(ULO, π, B); Algorithm 3: FixLabel(f[H+1], Z) |
| Open Source Code | Yes | Our code is available at https://github.com/FlorenceFeng/StateDecoding. |
| Open Datasets | No | We conduct experiments in two environments: Lock Bernoulli and Lock Gaussian. These environments, also studied in Du et al. (2019a), are designed to be hard for exploration. |
| Dataset Splits | No | The paper describes custom-built environments (Lock Bernoulli and Lock Gaussian) for which data is generated episodically, but it does not specify explicit training, validation, or test dataset splits (e.g., percentages or counts) or provide a method to reproduce such splits from a static dataset. |
| Hardware Specification | No | No specific hardware details such as GPU models, CPU models, or memory specifications used for running experiments are provided in the paper. |
| Software Dependencies | No | No specific software dependencies with version numbers (e.g., Python 3.x, PyTorch 1.x) are mentioned in the paper. |
| Experiment Setup | No | The paper states: 'Details about hyperparameters and unsupervised learning oracles in URL can be found in Appendix C.', deferring the specific experimental setup to a supplemental appendix rather than providing it in the main text. |