Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Occupancy-based Policy Gradient: Estimation, Convergence, and Optimality

Authors: Audrey Huang, Nan Jiang

NeurIPS 2024

Reproducibility Variable	Result	LLM Response
Research Type	Theoretical	For the first time, we demonstrate how policy optimization can be conducted with (only) occupancy functions for both online and offline RL, and comprehensively analyze both local and global convergence. ... As our work is the first in this line of research and theoretical in nature, for future work we plan to launch empirical investigations of our methods, especially those for optimizing general functionals.
Researcher Affiliation	Academia	Audrey Huang Department of Computer Science University of Illinois Urbana-Champaign Champaign, IL 61820 EMAIL Nan Jiang Department of Computer Science University of Illinois Urbana-Champaign Champaign, IL 61820 EMAIL
Pseudocode	Yes	Algorithm 1 OCCUPG: Online Occupancy-based Policy Gradient; Algorithm 2 OFF-OCCUPG: Offline Occupancy-based Policy Gradient; Algorithm 3 Online Occupancy-based PG for General Functionals; Algorithm 4 Maximum Likelihood Estimation; Algorithm 5 Fitted Occupancy Iteration with Smooth Clipping
Open Source Code	No	The NeurIPS Paper Checklist states: "The answer NA means that paper does not include experiments requiring code. ... We do not believe that Figure 1 constitutes as an experiment that requires code."
Open Datasets	No	The NeurIPS Paper Checklist explicitly states: "The answer NA means that the paper does not include experiments." The paper is theoretical and does not involve training models on datasets.
Dataset Splits	No	The NeurIPS Paper Checklist explicitly states: "The answer NA means that the paper does not include experiments." The paper is theoretical and does not describe experimental validation splits.
Hardware Specification	No	The NeurIPS Paper Checklist explicitly states: "The answer NA means that the paper does not include experiments." The paper is theoretical and does not mention any specific hardware used for computations or experiments.
Software Dependencies	No	The NeurIPS Paper Checklist explicitly states: "The answer NA means that the paper does not include experiments." The paper is theoretical and does not list any specific software dependencies with version numbers for experimental reproducibility.
Experiment Setup	No	The NeurIPS Paper Checklist explicitly states: "The answer NA means that the paper does not include experiments." The paper is theoretical and does not provide an experimental setup section with hyperparameters or training details.