Understanding the Effect of Stochasticity in Policy Optimization

Authors: Jincheng Mei, Bo Dai, Chenjun Xiao, Csaba Szepesvári, Dale Schuurmans

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | 3. If you ran experiments... (a) Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [N/A] (b) Did you specify all the training details (e.g., data splits, hyperparameters, how they were chosen)? [N/A] (c) Did you report error bars (e.g., with respect to the random seed after running experiments multiple times)? [N/A] (d) Did you include the total amount of compute and the type of resources used (e.g., type of GPUs, internal cluster, or cloud provider)? [N/A]
Researcher Affiliation | Collaboration | Jincheng Mei (1, 3), Bo Dai (3), Chenjun Xiao (1, 3), Csaba Szepesvári (2, 1), Dale Schuurmans (3, 1). Affiliations: (1) University of Alberta; (2) DeepMind; (3) Google Research, Brain Team. Equal advising.
Pseudocode | No | The paper defines update rules (e.g., Update 1, Update 2) mathematically but does not include structured pseudocode or algorithm blocks.
Open Source Code | No | 3. If you ran experiments... (a) Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [N/A]
Open Datasets | No | The paper is theoretical and does not involve empirical studies with datasets, as indicated by the N/A responses in the 'If you ran experiments' section of the self-assessment.
Dataset Splits | No | The paper is theoretical and does not involve empirical studies with dataset splits, as indicated by the N/A responses in the 'If you ran experiments' section of the self-assessment.
Hardware Specification | No | The paper is theoretical and does not describe any experimental hardware specifications, as indicated by the N/A responses in the 'If you ran experiments' section of the self-assessment.
Software Dependencies | No | The paper is theoretical and does not specify software dependencies with version numbers, as indicated by the N/A responses in the 'If you ran experiments' section of the self-assessment.
Experiment Setup | No | The paper is theoretical and does not describe specific experimental setup details such as hyperparameters or training configurations, as indicated by the N/A responses in the 'If you ran experiments' section of the self-assessment.