reproducibilityindex.ai

On the Convergence of Smooth Regularized Approximate Value Iteration Schemes

Authors: Elena Smirnova, Elvis Dohmatob

NeurIPS 2020

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Numerical illustration. We numerically confirm the implications of the smoothing technique on the convergence, provided by Proposition 4. We run experiments on a toy stochastic gridworld problem with the evaluation step error due to the sampling of state-transitions. We plot the performance loss over 30 runs with varying values of smoothing factor β. As can be seen from Figure 1, smaller values of β result in tighter confidence intervals, but slower convergence speed. Figure 1: Performance loss computed over 30 runs of the smooth AMPI (8) with sampling of environment transitions under varying smoothing degree β.
Researcher Affiliation	Collaboration	Elena Smirnova (esmirnovae@gmail.com); Elvis Dohmatob, Criteo AI Lab (e.dohmatob@criteo.com)
Pseudocode	No	The paper describes various algorithmic schemes mathematically (e.g., (MPI), (AMPI), (smooth AMPI)), but does not provide any pseudocode or clearly labeled algorithm blocks.
Open Source Code	No	The paper does not contain any statement or link providing access to open-source code for the described methodology.
Open Datasets	No	The paper mentions running experiments on a 'toy stochastic gridworld problem', but does not provide any concrete access information (link, citation, or repository) for this or any other public dataset.
Dataset Splits	No	The paper mentions a 'toy stochastic gridworld problem' but does not provide specific details on training, validation, or test dataset splits. It only refers to 'sampling of environment transitions'.
Hardware Specification	No	The paper does not provide any specific details about the hardware used for running the experiments (e.g., GPU models, CPU types, or cloud resources).
Software Dependencies	No	The paper does not provide specific software dependency details, such as library names with version numbers, needed to replicate the experiments.
Experiment Setup	No	The paper mentions 'varying values of smoothing factor β' for its numerical illustration but does not provide concrete hyperparameter values or detailed system-level training settings in the main text.
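
The numerical illustration quoted in the Research Type row is described only in prose, and the rows above note that no code, hyperparameter values, or environment details are available. As a rough aid for anyone attempting a replication, the Python sketch below shows one way such an experiment could be structured. Everything in it is an assumption rather than the authors' setup: the five-state 1-D gridworld, the slip probability, the discount factor, the β values, and the smoothing rule v ← (1 − β)·v + β·T̂v, which stands in for the paper's smooth AMPI update (8) and is not taken from it. It also reports the value error ‖v* − v_k‖∞ as a simple proxy for the performance loss.

```python
import random
import statistics

# Hypothetical toy settings; none of these values come from the paper.
N_STATES, ACTIONS, GAMMA, SLIP = 5, (-1, +1), 0.9, 0.2


def step_distribution(s, a):
    """Return [(prob, next_state, reward)] for a 1-D stochastic gridworld."""
    outcomes = []
    for move, p in ((a, 1.0 - SLIP), (-a, SLIP)):
        s2 = min(max(s + move, 0), N_STATES - 1)  # walls clamp the move
        reward = 1.0 if s2 == N_STATES - 1 else 0.0
        outcomes.append((p, s2, reward))
    return outcomes


def q_exact(v, s, a):
    """Exact one-step backup using the true transition model."""
    return sum(p * (r + GAMMA * v[s2]) for p, s2, r in step_distribution(s, a))


def q_sampled(v, s, a, rng):
    """Noisy one-step backup from a single sampled transition."""
    outcomes = step_distribution(s, a)
    _, s2, r = rng.choices(outcomes, weights=[o[0] for o in outcomes])[0]
    return r + GAMMA * v[s2]


def optimal_values(iters=500):
    """Exact value iteration, used as the reference v*."""
    v = [0.0] * N_STATES
    for _ in range(iters):
        v = [max(q_exact(v, s, a) for a in ACTIONS) for s in range(N_STATES)]
    return v


def smooth_run(beta, n_iters, rng, v_star):
    """Sampled value iteration with an assumed smoothing step of size beta."""
    v = [0.0] * N_STATES
    for _ in range(n_iters):
        target = [max(q_sampled(v, s, a, rng) for a in ACTIONS)
                  for s in range(N_STATES)]
        # Assumed smoothing rule: convex combination of old and new estimates.
        v = [(1.0 - beta) * v[s] + beta * target[s] for s in range(N_STATES)]
    # Sup-norm value error, a proxy for the paper's performance loss.
    return max(abs(x - y) for x, y in zip(v_star, v))


def summarize(beta, n_runs=30, n_iters=200, seed=0):
    """Mean and sample standard deviation of the error over seeded runs."""
    v_star = optimal_values()
    errors = [smooth_run(beta, n_iters, random.Random(seed + i), v_star)
              for i in range(n_runs)]
    return statistics.mean(errors), statistics.stdev(errors)


if __name__ == "__main__":
    for beta in (0.05, 0.2, 1.0):
        mean, std = summarize(beta)
        print(f"beta={beta}: mean error={mean:.3f}, std={std:.3f}")
```

Running the script prints the mean and standard deviation of the error over 30 seeded runs for each β. Whether the trend resembles Figure 1 depends on the assumed settings, so the output should be read as a template for a replication attempt, not as a reproduction of the paper's results.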