Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

On the Convergence of Continuous Single-timescale Actor-critic

Authors: Xuyang Chen, Lin Zhao

ICML 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Theoretical	We establish finite-time convergence by introducing a novel Lyapunov analysis framework, which provides a unified convergence characterization of both the actor and the critic. Our approach is less conservative than previous methods and offers new insights into the coupled dynamics of actor-critic updates. ... In this paper, we provide a finite-time convergence analysis for the single-sample, single-timescale actor-critic algorithm in continuous state-action spaces. We propose a novel Lyapunov analysis framework, which allows a less conservative analysis under the same set of assumptions adopted in existing studies.
Researcher Affiliation	Academia	Department of Electrical and Computer Engineering, National University of Singapore, Singapore. Correspondence to: Lin Zhao <EMAIL>.
Pseudocode	Yes	Algorithm 1 Continuous Single-sample Single-timescale Actor-Critic with Markovian Sampling
Open Source Code	No	The paper does not contain any explicit statement or link indicating the release of open-source code for the described methodology.
Open Datasets	No	The paper is theoretical, focusing on convergence analysis of actor-critic methods in continuous state-action spaces. It does not mention or use any specific datasets for empirical evaluation, so it provides no information about open datasets.
Dataset Splits	No	The paper is theoretical and does not involve experiments on specific datasets, so it does not mention dataset splits (e.g., training/validation/test splits).
Hardware Specification	No	The paper is a theoretical work on convergence analysis and does not describe any experimental setup or the hardware used to run experiments.
Software Dependencies	No	The paper is theoretical and does not describe any specific software dependencies or versions used for implementation.
Experiment Setup	No	The paper focuses on theoretical convergence analysis of an algorithm and does not provide details of an experimental setup, such as hyperparameters or training configurations.