Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

On the Convergence of Continuous Single-timescale Actor-critic

Authors: Xuyang Chen, Lin Zhao

ICML 2025

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Theoretical | We establish finite-time convergence by introducing a novel Lyapunov analysis framework, which provides a unified convergence characterization of both the actor and the critic. Our approach is less conservative than previous methods and offers new insights into the coupled dynamics of actor-critic updates. ... In this paper, we provide a finite-time convergence analysis for the single-sample, single-timescale actor-critic algorithm in continuous state-action spaces. We propose a novel Lyapunov analysis framework, which allows a less conservative analysis under the same set of assumptions adopted in existing studies. |
| Researcher Affiliation | Academia | ¹Department of Electrical and Computer Engineering, National University of Singapore, Singapore. Correspondence to: Lin Zhao <EMAIL>. |
| Pseudocode | Yes | Algorithm 1 Continuous Single-sample Single-timescale Actor-Critic with Markovian Sampling |
| Open Source Code | No | The paper does not contain any explicit statement or link indicating the release of open-source code for the described methodology. |
| Open Datasets | No | The paper is theoretical, focusing on convergence analysis of actor-critic methods in continuous state-action spaces. It does not mention or utilize any specific datasets for empirical evaluation, thus no information about open datasets is provided. |
| Dataset Splits | No | The paper is theoretical and does not involve experiments on specific datasets, therefore, there are no mentions of dataset splits (e.g., training/test/validation splits). |
| Hardware Specification | No | The paper is a theoretical work on convergence analysis and does not describe any experimental setup or the hardware used to run experiments. |
| Software Dependencies | No | The paper is theoretical and does not describe any specific software dependencies or versions used for implementation. |
| Experiment Setup | No | The paper focuses on theoretical convergence analysis of an algorithm and does not provide details of an experimental setup, such as hyperparameters or training configurations. |