Zap Q-Learning
Authors: Adithya M Devraj, Sean Meyn
NeurIPS 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Numerical experiments confirm the quick convergence, even in such non-ideal cases. Results from numerical experiments are surveyed here to illustrate the performance of the Zap Q-learning algorithm. (Section 3) |
| Researcher Affiliation | Academia | Department of Electrical and Computer Engineering, University of Florida, Gainesville, FL 32608. adithyamdevraj@ufl.edu, meyn@ece.ufl.edu |
| Pseudocode | Yes | Algorithm 1 Zap Q(λ)-learning |
| Open Source Code | No | The paper does not contain any explicit statements about releasing source code, nor does it provide any links to a code repository or mention code availability in supplementary materials. |
| Open Datasets | No | The paper mentions a "simple path-finding problem" and a "Finance model" taken from [27, 7]. While it refers to previous work, it does not provide concrete access information (specific link, DOI, repository name, or explicit statement of public availability with author/year attribution within this paper) for these datasets. |
| Dataset Splits | No | The paper describes numerical experiments but does not provide specific dataset split information (exact percentages, sample counts, citations to predefined splits, or detailed splitting methodology) for training, validation, and test sets. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details, such as library or solver names with version numbers, needed to replicate the experiment. |
| Experiment Setup | Yes | The paper states: "A special case is considered in the analysis here: the basis is chosen as in Watkins' algorithm, $\lambda = 0$, and $\alpha_n \equiv 1/n$"; "$\alpha_n = n^{-1}$, $\gamma_n = n^{-\rho}$, $n \ge 1$, for some fixed $\rho \in (1/2, 1)$"; "Experiments using $\beta = 0.8$ and $g = 70$ resulted in values..."; and "with gain $g = 70$, and the Zap algorithm, $\gamma_n \equiv \alpha_n^{0.85}$". These passages give specific parameter values used in the experiments. |
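
The step-size schedules quoted in the Experiment Setup row can be sketched in a few lines of Python. This is only an illustration of the two schedules as quoted, not the authors' code: the function names `alpha` and `gamma` are hypothetical, and the default `rho = 0.85` is an assumption taken from the quoted exponent in the step size for the Zap algorithm.

```python
def alpha(n: int) -> float:
    """Slower step size alpha_n = 1/n, defined for n >= 1."""
    if n < 1:
        raise ValueError("n must be >= 1")
    return 1.0 / n


def gamma(n: int, rho: float = 0.85) -> float:
    """Faster step size gamma_n = n^(-rho), with rho in (1/2, 1).

    rho = 0.85 is assumed here so that gamma_n equals alpha_n ** 0.85.
    """
    if n < 1:
        raise ValueError("n must be >= 1")
    if not 0.5 < rho < 1.0:
        raise ValueError("rho must lie strictly between 1/2 and 1")
    return float(n) ** (-rho)


if __name__ == "__main__":
    # The ratio gamma_n / alpha_n = n^(1 - rho) grows with n,
    # so gamma_n decays more slowly than alpha_n.
    for n in (1, 10, 100, 1000):
        print(n, alpha(n), gamma(n), gamma(n) / alpha(n))
```

Because $\rho < 1$, the ratio $\gamma_n / \alpha_n = n^{1-\rho}$ increases without bound, which is what separates the two time scales in the quoted setup.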