Geometric Insights into the Convergence of Nonlinear TD Learning
Authors: David Brandfonbrener, Joan Bruna
ICLR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We perform a small set of experiments on the divergent spiral example from Section 2.3 which support our conclusions about reversibility and k-step returns. |
| Researcher Affiliation | Academia | David Brandfonbrener, Courant Institute of Mathematical Sciences, New York University (david.brandfonbrener@nyu.edu); Joan Bruna, Courant Institute of Mathematical Sciences and Center for Data Science, New York University (bruna@cims.nyu.edu) |
| Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks. It primarily presents mathematical derivations and proofs. |
| Open Source Code | No | The paper does not provide any statement or link indicating that source code for the methodology is openly available. |
| Open Datasets | No | The paper does not explicitly mention the use of a publicly available or open dataset for training. It defines Markov Reward Processes and discusses theoretical models rather than specific datasets. |
| Dataset Splits | No | The paper discusses theoretical models and numerical experiments but does not provide specific dataset split information (e.g., percentages or sample counts) for training, validation, or testing. |
| Hardware Specification | No | The paper does not explicitly describe the hardware used to run its experiments. It only mentions 'numerical experiment' without specifying CPU, GPU, or other hardware details. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers. As a theoretical paper with only a small numerical experiment, it does not detail its software stack. |
| Experiment Setup | No | The numerical experiment section describes how the environment was set up (e.g., adding reverse connections, increasing k; see the sketch below the table), but it does not specify the experimental setup details common in machine learning papers, such as hyperparameters (learning rate, batch size, epochs, optimizer settings) or system-level training configurations. |
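
For context on the divergent spiral example and the two fixes the table refers to (reversibility and k-step returns), below is a minimal sketch of expected TD updates on a three-state chain in the style of Tsitsiklis and Van Roy (1997). This is not the authors' code: the discount factor, spiral rate, step size, basis vectors, and iteration counts are illustrative assumptions, and the `expected_td_step` helper is hypothetical.

```python
import numpy as np

# Hedged sketch of the three-state "spiral" divergence example
# (Tsitsiklis & Van Roy, 1997), as revisited in Section 2.3 of the paper.
# All constants below are illustrative choices, not the authors' values.

gamma = 0.9   # discount factor (assumed)
eps = 0.05    # outward spiral rate (assumed)
alpha = 0.01  # step size (assumed)

# Deterministic cycle 1 -> 2 -> 3 -> 1: a non-reversible chain.
# Rewards are zero, so the true value function is V* = 0.
P = np.array([[0., 1., 0.],
              [0., 0., 1.],
              [1., 0., 0.]])
mu = np.full(3, 1. / 3.)  # stationary distribution (uniform)

# Orthogonal basis of the plane perpendicular to (1, 1, 1); the cyclic
# permutation P acts on this plane as a 120-degree rotation, so V_theta
# below traces a spiral that widens as theta grows.
a = np.array([1., -0.5, -0.5])
b = np.array([0., -np.sqrt(3) / 2., np.sqrt(3) / 2.])

def V(theta):
    return np.exp(eps * theta) * (np.cos(theta) * a + np.sin(theta) * b)

def dV(theta):  # d V_theta / d theta
    return eps * V(theta) + np.exp(eps * theta) * (-np.sin(theta) * a + np.cos(theta) * b)

def expected_td_step(theta, P, k=1):
    """One expected k-step TD update; with zero rewards the k-step
    target reduces to gamma^k P^k V_theta."""
    v = V(theta)
    target = gamma**k * np.linalg.matrix_power(P, k) @ v
    delta = target - v
    return theta + alpha * np.sum(mu * delta * dV(theta))

# "Adding reverse connections": symmetrizing P yields a reversible chain
# with the same uniform stationary distribution.
P_rev = 0.5 * (P + P.T)

for name, Pk, k in [("cycle,      k=1", P, 1),
                    ("reversible, k=1", P_rev, 1),
                    ("cycle,      k=2", P, 2)]:
    theta = 1.0
    for _ in range(1500):
        theta = expected_td_step(theta, Pk, k)
    # ||V_theta|| scales with e^(eps * theta), so rising theta means divergence.
    print(f"{name}: theta = {theta:7.2f}, ||V_theta|| = {np.linalg.norm(V(theta)):.3f}")
```

With these assumed constants, the one-step update on the cycle drifts toward larger theta (so ‖V_theta‖ grows), while the symmetrized chain and the k-step target drift back toward V* = 0, mirroring the reversibility and k-step-return conclusions quoted in the table.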