Effectiveness of Constant Stepsize in Markovian LSA and Statistical Inference
Authors: Dongyan (Lucy) Huo, Yudong Chen, Qiaomin Xie
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct extensive numerical experiments and compare against classical inference approaches. Our results show that using a constant stepsize enjoys easy hyperparameter tuning, fast convergence, and consistently better CI coverage, especially when data is limited. |
| Researcher Affiliation | Academia | (1) School of Operations Research and Information Engineering, Cornell University; (2) Department of Computer Sciences, University of Wisconsin-Madison; (3) Department of Industrial and Systems Engineering, University of Wisconsin-Madison |
| Pseudocode | No | No pseudocode or algorithm blocks are explicitly presented or labeled in the paper. |
| Open Source Code | No | No explicit statement or link providing access to open-source code for the described methodology. |
| Open Datasets | No | We consider LSA problems in dimension d = 5 for a finite-state, irreducible, and aperiodic Markov chain with N = 10 states. We generate the transition probability matrix P and the functions A and b randomly; see the appendix for details. |
| Dataset Splits | No | The paper describes its data generation and processing steps (e.g., burn-in, batching) but does not specify standard training, validation, and test dataset splits in the context of model training. |
| Hardware Specification | No | No specific hardware details (e.g., GPU models, CPU types, memory) used for running experiments are provided in the paper. |
| Software Dependencies | No | No specific software dependencies or their version numbers (e.g., Python, PyTorch, TensorFlow versions) are mentioned. |
| Experiment Setup | Yes | We consider LSA problems in dimension d = 5 for a finite-state, irreducible, and aperiodic Markov chain with N = 10 states. We mainly study under constant stepsizes α = 0.2 and α = 0.02. The diminishing rate t^(−0.5) is chosen... The first b iterates (θ_t^(α), t = 0, ..., b − 1) are treated as initial burn-in and are not used in the inference procedure. For the remaining iterates, we divide them equally into K batches of size n. Within each batch, we discard the first n_0 (≥ 0) iterates... |
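
For concreteness, the sketch below illustrates the kind of pipeline the quoted setup describes: a constant-stepsize Markovian LSA run followed by batch-means confidence intervals. This is a minimal sketch under stated assumptions, not the authors' code: the random construction of P, A, and b (the paper defers its exact recipe to the appendix), the stabilizing −I component of A, and the burn-in/batch parameters b0, K, n, and n0 are all hypothetical stand-ins.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

d, N = 5, 10       # problem dimension and number of Markov-chain states (from the paper)
alpha = 0.02       # one of the two constant stepsizes studied (0.2 and 0.02)

# Randomly generated problem instance (assumed construction; the paper's
# appendix specifies its own). A is shifted by -I so that the mean dynamics
# are stable and the iteration converges for small alpha.
P = rng.random((N, N))
P /= P.sum(axis=1, keepdims=True)            # row-stochastic transition matrix
A = {s: -np.eye(d) + 0.1 * rng.standard_normal((d, d)) for s in range(N)}
b = {s: rng.standard_normal(d) for s in range(N)}

def run_lsa(T, s0=0):
    """Constant-stepsize Markovian LSA: theta_{t+1} = theta_t + alpha*(A(s_t) theta_t + b(s_t))."""
    theta, s = np.zeros(d), s0
    iterates = np.empty((T, d))
    for t in range(T):
        theta = theta + alpha * (A[s] @ theta + b[s])
        iterates[t] = theta
        s = rng.choice(N, p=P[s])            # sample the next Markov-chain state
    return iterates

# Batch-means inference, mirroring the quoted procedure: drop a burn-in of
# b0 iterates, split the rest into K batches of size n, and discard the
# first n0 iterates within each batch. All four values are hypothetical.
b0, K, n, n0 = 1_000, 30, 500, 100
iterates = run_lsa(b0 + K * n)[b0:]
batch_means = iterates.reshape(K, n, d)[:, n0:, :].mean(axis=1)  # one mean per batch
center = batch_means.mean(axis=0)
se = batch_means.std(axis=0, ddof=1) / np.sqrt(K)
q = stats.t.ppf(0.975, df=K - 1)             # two-sided 95% t-quantile
print("95% CI per coordinate:\n", np.c_[center - q * se, center + q * se])
```

The t-interval with K − 1 degrees of freedom matches the K-batch construction quoted above; the paper's comparisons against classical inference approaches are not reproduced here.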