Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. [K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026].
Finite-Sample Analysis of Off-Policy TD-Learning via Generalized Bellman Operators
Authors: Zaiwei Chen, Siva Theja Maguluri, Sanjay Shakkottai, Karthikeyan Shanmugam
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | In this paper, we derive finite-sample bounds for any general off-policy TD-like stochastic approximation algorithm that solves for the fixed-point of this generalized Bellman operator. Our key step is to show that the generalized Bellman operator is simultaneously a contraction mapping with respect to a weighted ℓp-norm for each p in [1, ∞), with a common contraction factor. ... Did you run experiments? [N/A] |
| Researcher Affiliation | Collaboration | Zaiwei Chen (Georgia Institute of Technology); Siva Theja Maguluri (Georgia Institute of Technology); Sanjay Shakkottai (The University of Texas at Austin); Karthikeyan Shanmugam (IBM Research NY) |
| Pseudocode | Yes | Algorithm 1: A Generic Algorithm for Multi-Step Off-Policy TD-Learning |
| Open Source Code | No | The paper states in its checklist: 'Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [N/A]' and does not provide any links or statements about open-sourcing code. |
| Open Datasets | No | The paper is theoretical and does not describe experiments involving datasets. The checklist includes: 'If you are using existing assets, did you cite the creators? [N/A]' |
| Dataset Splits | No | The paper is theoretical and does not describe experimental validation or dataset splits. The checklist includes: 'Did you specify all the training details (e.g., data splits, hyperparameters, how they were chosen)? [N/A]' |
| Hardware Specification | No | The paper is theoretical and does not describe any hardware used for experiments. The checklist states: 'Did you include the total amount of compute and the type of resources used (e.g., type of GPUs, internal cluster, or cloud provider)? [N/A]' |
| Software Dependencies | No | The paper is theoretical and does not describe any specific software dependencies with version numbers for experiments. The checklist states: 'Did you include the code, data, and instructions needed to reproduce the main experimental results? [N/A]' |
| Experiment Setup | No | The paper is theoretical and does not describe an experimental setup or hyperparameters. The checklist states: 'Did you specify all the training details (e.g., data splits, hyperparameters, how they were chosen)? [N/A]' |