A Provably Efficient Sample Collection Strategy for Reinforcement Learning
Authors: Jean Tarbouriech, Matteo Pirotta, Michal Valko, Alessandro Lazaric
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "In this section we report a preliminary numerical validation of our theoretical findings." |
| Researcher Affiliation | Collaboration | Jean Tarbouriech (Facebook AI Research Paris & Inria Lille, jean.tarbouriech@gmail.com); Matteo Pirotta (Facebook AI Research Paris, pirotta@fb.com); Michal Valko (DeepMind Paris, valkom@deepmind.com); Alessandro Lazaric (Facebook AI Research Paris, lazaric@fb.com) |
| Pseudocode | Yes | Algorithm 1: GOSPRL algorithm (a hedged sketch of its outer loop appears after this table) |
| Open Source Code | No | The paper does not provide an explicit statement or link for open-source code availability. |
| Open Datasets | No | The paper mentions the 'River Swim domain' and 'Garnet environment' but does not provide concrete access information (link, DOI, or formal citation with author/year) establishing public availability. |
| Dataset Splits | No | The paper does not specify exact training, validation, or test dataset splits. |
| Hardware Specification | No | The paper does not provide specific hardware details (GPU/CPU models, processor types, or memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details, such as library or solver names with version numbers. |
| Experiment Setup | Yes | "We consider a TREASURE-type problem (Sect. 4.1), where for all (s, a) we set b(s, a) = 10 instead of 1 (we call it the TREASURE-10 problem)." |
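
For readers unfamiliar with the paper's Algorithm 1: GOSPRL treats sample collection as a sequence of goal-reaching tasks, repeatedly steering the agent toward a state whose sampling requirement is unmet and executing the required action there. The Python sketch below is a minimal illustration of that outer loop under the uniform TREASURE-10 requirement b(s, a) = 10 quoted above; it is not the paper's implementation. The environment interface (`reset() -> state`, `step(action) -> next state`) and the random-walk `navigate_to` routine, which stands in for the paper's goal-reaching policy, are assumptions made purely for the sketch.

```python
import numpy as np


def gosprl_sketch(env, n_states, n_actions, b=10, max_steps=100_000):
    """Hedged sketch of a GOSPRL-style sample-collection loop: keep
    targeting an under-sampled (s, a) pair until every pair has been
    sampled at least b times (b = 10 matches the TREASURE-10 setup)."""
    rng = np.random.default_rng(0)
    counts = np.zeros((n_states, n_actions), dtype=int)
    s = env.reset()  # assumed interface: reset() -> initial state
    for _ in range(max_steps):
        unmet = np.argwhere(counts < b)   # requirements not yet satisfied
        if unmet.size == 0:
            return counts                 # every b(s, a) has been collected
        goal_s, goal_a = unmet[0]         # simplest pick: first unmet pair
        s = navigate_to(env, s, goal_s, n_actions, rng)
        s = env.step(goal_a)              # assumed interface: step(a) -> next state
        counts[goal_s, goal_a] += 1       # a full implementation would also
                                          # credit samples seen while navigating
    return counts


def navigate_to(env, s, goal_s, n_actions, rng):
    """Placeholder goal-reaching routine. In the paper this role is played
    by a shortest-path policy learned online; a random walk is used here
    only to keep the sketch self-contained."""
    while s != goal_s:
        s = env.step(rng.integers(n_actions))
    return s
```

In the algorithm itself, the goal-reaching policy is computed from optimistic estimates of the dynamics (cast as a stochastic shortest-path problem), which is what underpins the paper's formal sample-complexity guarantee; the random walk above is for illustration only.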