A Finite-Time Analysis of Q-Learning with Neural Network Function Approximation

Authors: Pan Xu, Quanquan Gu

ICML 2020

Reproducibility Variable	Result	LLM Response
Research Type	Theoretical	In this paper, we present a finite-time analysis of a neural Q-learning algorithm, where the data are generated from a Markov decision process, and the action-value function is approximated by a deep ReLU neural network. We prove that neural Q-learning finds the optimal policy with O(1/√T) convergence rate if the neural function approximator is sufficiently overparameterized, where T is the number of iterations.
Researcher Affiliation	Academia	Pan Xu¹, Quanquan Gu¹. ¹Department of Computer Science, University of California, Los Angeles. Correspondence to: Quanquan Gu <qgu@cs.ucla.edu>.
Pseudocode	Yes	Algorithm 1 Neural Q-Learning with Gaussian Initialization
Open Source Code	No	The paper does not provide any explicit statements about releasing source code or links to a code repository.
Open Datasets	No	The paper is theoretical and focuses on analysis of an algorithm where "data are generated from a Markov decision process," but it does not specify or provide access information for a particular public dataset for training or evaluation.
Dataset Splits	No	This is a theoretical paper and does not describe empirical experiments with data splits for training, validation, or testing.
Hardware Specification	No	The paper is theoretical and does not describe any specific hardware used for running experiments.
Software Dependencies	No	The paper does not specify any software dependencies with version numbers.
Experiment Setup	No	The paper is theoretical, providing an algorithm (Algorithm 1) and its convergence analysis, but it does not detail a specific experimental setup with concrete hyperparameter values or training configurations for empirical reproduction.
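
The table above reports that the paper supplies pseudocode (Algorithm 1) but no source code and no concrete experimental configuration. For orientation only, the following is a minimal sketch of projected semi-gradient Q-learning with a Gaussian-initialized ReLU network, the kind of method the title of Algorithm 1 names. It is not the authors' implementation and is not transcribed from the paper. The two-layer architecture, the N(0, 1/m) weight scaling, the projection onto a ball around the initialization, the toy feature vectors, the hyperparameter values, and all function names (`init_params`, `q_value`, `distance`, `project`, `td_step`) are assumptions chosen for illustration.

```python
import math
import random


def init_params(d, m, rng):
    """Two-layer ReLU network with Gaussian weights (assumed scaling N(0, 1/m))."""
    std = 1.0 / math.sqrt(m)
    W = [[rng.gauss(0.0, std) for _ in range(d)] for _ in range(m)]
    v = [rng.gauss(0.0, std) for _ in range(m)]
    return W, v


def q_value(params, x):
    """Return f(theta; x) and the hidden activations for a state-action feature x."""
    W, v = params
    hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in W]
    return sum(vj * hj for vj, hj in zip(v, hidden)), hidden


def distance(params, init):
    """Euclidean distance between two parameter sets."""
    (W, v), (W0, v0) = params, init
    sq = sum((a - b) ** 2 for a, b in zip(v, v0))
    sq += sum((a - b) ** 2 for r, r0 in zip(W, W0) for a, b in zip(r, r0))
    return math.sqrt(sq)


def project(params, init, radius):
    """Project params onto the Euclidean ball of the given radius around init."""
    dist = distance(params, init)
    if dist <= radius:
        return params
    c = radius / dist
    (W, v), (W0, v0) = params, init
    W_p = [[b + c * (a - b) for a, b in zip(r, r0)] for r, r0 in zip(W, W0)]
    v_p = [b + c * (a - b) for a, b in zip(v, v0)]
    return W_p, v_p


def td_step(params, init, x, reward, next_xs, gamma, lr, radius):
    """One semi-gradient Q-learning update followed by projection.

    Returns the new parameters and the TD error computed before the update.
    """
    W, v = params
    q, hidden = q_value(params, x)
    # Temporal-difference error against the greedy bootstrapped target.
    delta = q - (reward + gamma * max(q_value(params, nx)[0] for nx in next_xs))
    # Gradient of f w.r.t. v_j is hidden_j; w.r.t. row W_j it is v_j * 1[hidden_j > 0] * x.
    new_v = [vj - lr * delta * hj for vj, hj in zip(v, hidden)]
    new_W = [
        [w - lr * delta * vj * (1.0 if hj > 0.0 else 0.0) * xi for w, xi in zip(row, x)]
        for row, vj, hj in zip(W, v, hidden)
    ]
    return project((new_W, new_v), init, radius), delta


# Hypothetical usage: one repeated transition with hand-made one-hot features.
rng = random.Random(0)
theta0 = init_params(d=4, m=32, rng=rng)
theta = theta0
x = [1.0, 0.0, 0.0, 0.0]                                  # feature of (s, a)
next_xs = [[0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]    # features of (s', b) for each action b
for t in range(100):
    theta, delta = td_step(theta, theta0, x, 1.0, next_xs, gamma=0.9, lr=0.1, radius=5.0)
```

The projection keeps the iterate within a fixed distance of its Gaussian initialization, which is one common way to keep an overparameterized network close to its linearization. Pure standard-library Python is used so the sketch runs without extra dependencies; a practical experiment would use an array library and data sampled from an actual Markov decision process.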