Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. [K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026].
On the Convergence of Smooth Regularized Approximate Value Iteration Schemes
Authors: Elena Smirnova, Elvis Dohmatob
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Numerical illustration. We numerically confirm the implications of the smoothing technique on the convergence, provided by Proposition 4. We run experiments on a toy stochastic gridworld problem with the evaluation step error due to the sampling of state-transitions. We plot the performance loss over 30 runs with varying values of smoothing factor β. As can be seen from Figure 1, smaller values of β result in tighter confidence intervals, but slower convergence speed. [Figure 1 caption:] Performance loss computed over 30 runs of the smooth AMPI (8) with sampling of environment transitions under varying smoothing degree β. |
| Researcher Affiliation | Collaboration | Elena Smirnova EMAIL Elvis Dohmatob Criteo AI Lab EMAIL |
| Pseudocode | No | The paper describes various algorithmic schemes mathematically (e.g., (MPI), (AMPI), (smooth AMPI)), but does not provide any pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | No | The paper does not contain any statement or link providing access to open-source code for the described methodology. |
| Open Datasets | No | The paper mentions running experiments on a 'toy stochastic gridworld problem', but does not provide any concrete access information (link, citation, or repository) for this or any other public dataset. |
| Dataset Splits | No | The paper mentions a 'toy stochastic gridworld problem' but does not provide specific details on training, validation, or test dataset splits. It only refers to 'sampling of environment transitions'. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used for running the experiments (e.g., GPU models, CPU types, or cloud resources). |
| Software Dependencies | No | The paper does not provide specific software dependency details, such as library names with version numbers, needed to replicate the experiments. |
| Experiment Setup | No | The paper mentions 'varying values of smoothing factor β' for its numerical illustration but does not provide concrete hyperparameter values or detailed system-level training settings in the main text. |
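
The quoted numerical illustration (performance loss over 30 runs under varying smoothing factor β) can be pictured with a rough sketch. The paper's smooth AMPI scheme (8), its gridworld, and its hyperparameters are not reproduced on this page, so everything below is an assumption made for illustration: a small random MDP stands in for the gridworld, Gaussian noise stands in for the sampling error of the evaluation step, and the update is a generic exponential smoothing of noisy value iteration, `V <- (1 - beta) * V + beta * noisy_backup(V)`. All function names and default values are hypothetical.

```python
import numpy as np


def make_random_mdp(n_states=16, n_actions=4, seed=0):
    """Hypothetical stand-in for the toy gridworld: a small random MDP."""
    rng = np.random.default_rng(seed)
    # P[s, a] is a probability distribution over next states.
    P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))
    R = rng.uniform(0.0, 1.0, size=(n_states, n_actions))
    return P, R


def bellman_q(P, R, V, gamma):
    """One-step lookahead Q(s, a) = R(s, a) + gamma * E[V(s')]."""
    return R + gamma * (P @ V)


def optimal_value(P, R, gamma, n_iter=2000):
    """Exact value iteration, used only as the reference V*."""
    V = np.zeros(R.shape[0])
    for _ in range(n_iter):
        V = bellman_q(P, R, V, gamma).max(axis=1)
    return V


def policy_value(P, R, policy, gamma):
    """Value of a deterministic policy via the linear system (I - gamma P_pi) V = R_pi."""
    idx = np.arange(R.shape[0])
    P_pi, R_pi = P[idx, policy], R[idx, policy]
    return np.linalg.solve(np.eye(len(idx)) - gamma * P_pi, R_pi)


def smoothed_noisy_vi(P, R, V_star, gamma, beta, noise_std, n_iter, rng):
    """Assumed smoothing scheme (not the paper's (8)): damped noisy backups.

    Returns the performance loss max_s (V*(s) - V^{pi_k}(s)) at each iteration,
    where pi_k is greedy with respect to the current iterate.
    """
    V = np.zeros(R.shape[0])
    losses = np.empty(n_iter)
    for k in range(n_iter):
        Q = bellman_q(P, R, V, gamma)
        # Gaussian noise is an assumed proxy for evaluation error.
        Q_noisy = Q + noise_std * rng.standard_normal(Q.shape)
        V = (1.0 - beta) * V + beta * Q_noisy.max(axis=1)
        greedy = bellman_q(P, R, V, gamma).argmax(axis=1)
        losses[k] = np.max(V_star - policy_value(P, R, greedy, gamma))
    return losses


def run_experiment(betas=(0.1, 0.5, 1.0), n_runs=30, n_iter=100,
                   gamma=0.9, noise_std=0.1, seed=0):
    """Mean and standard deviation of the loss curve per beta, over n_runs runs."""
    P, R = make_random_mdp(seed=seed)
    V_star = optimal_value(P, R, gamma)
    rng = np.random.default_rng(seed + 1)
    summary = {}
    for beta in betas:
        runs = np.stack([
            smoothed_noisy_vi(P, R, V_star, gamma, beta, noise_std, n_iter, rng)
            for _ in range(n_runs)
        ])
        summary[beta] = (runs.mean(axis=0), runs.std(axis=0))
    return summary
```

Calling `run_experiment()` returns, for each β, a mean loss curve and a per-iteration standard deviation that could be plotted as a band around the mean. Whether this stand-in shows the trade-off reported in the paper depends on the assumed noise level and MDP, so it should not be read as a replication.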