On the Statistical Benefits of Temporal Difference Learning
Authors: David Cheikhi, Daniel Russo
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Figure 3 displays the Mean Square Error (MSE) of the TD and MC estimates for these quantities when the dataset contains n = 2000 independent trajectories. MSE calculations involve 10000 Monte-Carlo replications. |
| Researcher Affiliation | Academia | David Cheikhi¹ Daniel Russo¹ ¹Columbia University. Correspondence to: David Cheikhi <d.cheikhi@columbia.edu>. |
| Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | No | The paper does not provide any statement or link for open-source code availability. |
| Open Datasets | No | The paper mentions using a 'batch of trajectories' and 'dataset' but does not provide concrete access information (link, DOI, repository, or formal citation) for a publicly available dataset. |
| Dataset Splits | No | The paper mentions 'training data' and a 'dataset' used for calculations (e.g., 'n = 2000 independent trajectories'), but it does not provide specific details on train/validation/test dataset splits or cross-validation setup. |
| Hardware Specification | No | The paper does not provide any specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details with version numbers (e.g., library or solver names with version numbers like Python 3.8, PyTorch 1.9, CPLEX 12.4). |
| Experiment Setup | Yes | We consider a Layered MRP with width W = 5. We focus on state $s^{(1)}_1$ and $s^{(2)}_1$ and study the accuracy of the estimates of their value as we vary the horizon T of the MRP. ... the dataset contains n = 2000 independent trajectories. MSE calculations involve 10000 Monte-Carlo replications. |
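
Since the paper provides neither pseudocode nor code, the following is a minimal sketch of how an MSE comparison of this kind could be set up. It is not the authors' implementation. Everything beyond the quoted settings (W = 5, n = 2000, 10000 replications) is an assumption made for illustration: transitions are uniform from each layer to the next, rewards have a per-state mean plus Gaussian noise, and the TD estimate is computed as the batch TD(0) fixed point, that is, a backward recursion on empirical means. The names `simulate`, `true_values`, `mc_estimate`, `td_estimate`, and `mse_comparison` are hypothetical.

```python
import numpy as np


def simulate(n, T, W, reward_means, noise_std, rng):
    """Draw n trajectories through a hypothetical layered MRP.

    Assumption: the state in each layer is uniform over W states, so
    transitions do not depend on the current state.
    """
    states = rng.integers(0, W, size=(n, T))
    noise = noise_std * rng.normal(size=(n, T))
    # rewards[k, t] = reward_means[t, states[k, t]] + noise[k, t]
    rewards = reward_means[np.arange(T), states] + noise
    return states, rewards


def true_values(reward_means):
    """Exact values under the uniform-transition assumption."""
    T, W = reward_means.shape
    v = np.zeros((T + 1, W))
    for t in range(T - 1, -1, -1):
        v[t] = reward_means[t] + v[t + 1].mean()
    return v[:T]


def mc_estimate(states, rewards, t, i):
    """Monte Carlo: average return-to-go over trajectories visiting (t, i)."""
    visited = states[:, t] == i
    return rewards[visited, t:].sum(axis=1).mean()


def td_estimate(states, rewards, W):
    """Batch TD(0) fixed point via backward recursion on empirical means."""
    n, T = states.shape
    v = np.zeros((T + 1, W))  # row T is the terminal layer with value 0
    for t in range(T - 1, -1, -1):
        for i in range(W):
            visited = states[:, t] == i
            if not visited.any():
                continue  # unvisited states keep the default value 0
            r_hat = rewards[visited, t].mean()
            if t + 1 < T:
                # Mean of next-state values equals sum_j P_hat[j] * v[t+1, j].
                next_v = v[t + 1, states[visited, t + 1]].mean()
            else:
                next_v = 0.0
            v[t, i] = r_hat + next_v
    return v[:T]


def mse_comparison(n, T, W, reps, seed=0, noise_std=1.0):
    """Return (TD MSE, MC MSE) for the value of state 0 in the first layer."""
    rng = np.random.default_rng(seed)
    reward_means = rng.normal(size=(T, W))  # assumed reward structure
    target = true_values(reward_means)[0, 0]
    td_err, mc_err = [], []
    for _ in range(reps):
        states, rewards = simulate(n, T, W, reward_means, noise_std, rng)
        td_err.append((td_estimate(states, rewards, W)[0, 0] - target) ** 2)
        mc_err.append((mc_estimate(states, rewards, 0, 0) - target) ** 2)
    return float(np.mean(td_err)), float(np.mean(mc_err))
```

Calling `mse_comparison(n=2000, T=T, W=5, reps=10000)` for a range of horizons `T` would mirror the quoted settings, though the pure-Python loops make that slow; smaller `n` and `reps` are enough to check that the sketch runs.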