Least Squares Regression with Markovian Data: Fundamental Limits and Algorithms
Authors: Dheeraj Nagaraj, Xian Wu, Guy Bresler, Prateek Jain, Praneeth Netrapalli
NeurIPS 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Simulations confirm our analysis and indicates that our derived rates are tight. We also conducted experiments on data generated using Gaussian AR MC (6). We set (d, σ, ϵ) = (10, 1e 3, 0.01) and choose the buffer size B = 1/ϵ2. We report results averaged over 100 runs. Figure 1a compare the estimation error achieved by SGD, SGD-DD, and the proposed SGDER method. Note that, as expected by our theorems, the decay regime starts at d τmix for SGD-ER and dτmix for SGD which is similar to rate of SGD-DD. |
| Researcher Affiliation | Collaboration | Guy Bresler Massachusetts Institute of Technology Cambridge, USA 02139 guy@mit.edu Prateek Jain Microsoft Research Bengaluru, India 560001 prajain@microsoft.com Dheeraj Nagaraj Massachusetts Institute of Technology Cambridge, USA 02139 dheeraj@mit.edu Praneeth Netrapalli Microsoft Research Bengaluru, India 560001 praneeth@microsoft.com Xian Wu Stanford University Stanford, USA 94305 xwu20@stanford.edu |
| Pseudocode | Yes | Algorithm 1 SGD-DD; Algorithm 2 Parallel SGD; Algorithm 3 SGD with tail-averaging; Algorithm 4 SGD with Experience Replay (SGD-ER) |
| Open Source Code | No | The paper does not provide explicit statements or links indicating that the source code for the described methodology is publicly available. |
| Open Datasets | No | We also conducted experiments on data generated using Gaussian AR MC (6). The paper uses simulated data, but does not provide any access information for it. |
| Dataset Splits | No | The paper describes conducting experiments with generated data but does not specify any formal train/validation/test dataset splits or their percentages/counts for reproduction. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., CPU/GPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., library names with versions, programming language versions, or solver versions) needed to replicate the experiments. |
| Experiment Setup | Yes | We set (d, σ, ϵ) = (10, 1e 3, 0.01) and choose the buffer size B = 1/ϵ2. We report results averaged over 100 runs. Require: (X1, Y1), . . . (XT , YT ) Rd sampled using (6), η: learning rate w N(0, 1), B 1 ϵ7 , u max( 2 ϵ2 log 300000πd B ϵ2 log 300000πd2σ6 ϵ2δ ), S B + u |