Effectiveness of Constant Stepsize in Markovian LSA and Statistical Inference

Authors: Dongyan (Lucy) Huo, Yudong Chen, Qiaomin Xie

AAAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We conduct extensive numerical experiments and compare against classical inference approaches. Our results show that using a constant stepsize enjoys easy hyperparameter tuning, fast convergence, and consistently better CI coverage, especially when data is limited."
Researcher Affiliation | Academia | "1 School of Operations Research and Information Engineering, Cornell University; 2 Department of Computer Sciences, University of Wisconsin-Madison; 3 Department of Industrial and Systems Engineering, University of Wisconsin-Madison"
Pseudocode | No | No pseudocode or algorithm blocks are explicitly presented or labeled in the paper.
Open Source Code | No | No explicit statement or link provides access to open-source code for the described methodology.
Open Datasets | No | "We consider LSA problems in dimension d = 5 for a finite-state, irreducible, and aperiodic Markov chain with N = 10 states. We generate the transition probability P and the functions A and b randomly; see the appendix for details."
Dataset Splits | No | The paper describes its data generation and processing steps (e.g., burn-in, batching) but does not specify standard training, validation, and test splits.
Hardware Specification | No | No specific hardware details (e.g., GPU models, CPU types, memory) used to run the experiments are provided in the paper.
Software Dependencies | No | No specific software dependencies or version numbers (e.g., Python, PyTorch, TensorFlow) are mentioned.
Experiment Setup | Yes | "We consider LSA problems in dimension d = 5 for a finite-state, irreducible, and aperiodic Markov chain with N = 10 states. We mainly study under constant stepsizes α = 0.2 and α = 0.02. The diminishing rate t^{-0.5} is chosen... The first b iterates {θ_t^{(α)}}_{t=0}^{b-1} are considered as initial burn-in and are not used in the inference procedure. For the remaining iterates, we divide them equally into K batches of size n. Within each batch, we discard the first n0 (≥ 0) iterates..."
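The quoted experiment setup can be sketched in code: run Markovian LSA with a constant stepsize, discard a burn-in, split the remaining iterates into K batches, drop the first n0 iterates of each batch, and form batch-mean confidence intervals. This is a minimal illustration under our own assumptions, not the authors' code: the construction of A and b (each A(s) made negative definite so the iterates stay bounded), the chain length T, the burn-in b, K, and n0 are all illustrative choices; the paper's actual generation details are in its appendix.

```python
import numpy as np

rng = np.random.default_rng(0)

# Problem sizes from the quoted setup: dimension d = 5, N = 10 Markov states.
d, N = 5, 10

# Random transition matrix with strictly positive entries, hence
# irreducible and aperiodic, as in the paper's setup.
P = rng.random((N, N)) + 0.1
P /= P.sum(axis=1, keepdims=True)

# Random per-state A(s) and b(s). The paper draws these randomly (details in
# its appendix); here each A(s) is made negative definite -- an illustrative
# stability choice of ours, not the authors'.
A = np.empty((N, d, d))
for s in range(N):
    M = rng.normal(size=(d, d))
    A[s] = -(M @ M.T / d + np.eye(d))
b = rng.normal(size=(N, d))

def lsa_constant_step(alpha, T):
    """Markovian LSA: theta_{t+1} = theta_t + alpha * (A(s_t) theta_t + b(s_t))."""
    theta = np.zeros(d)
    s = 0
    path = np.empty((T, d))
    for t in range(T):
        path[t] = theta
        theta = theta + alpha * (A[s] @ theta + b[s])
        s = rng.choice(N, p=P[s])  # next state of the Markov chain
    return path

def batch_means(path, burn, K, n0=0):
    """Discard a burn-in, split the rest into K equal batches of size n,
    drop the first n0 iterates of each batch, and average the remainder."""
    tail = path[burn:]
    n = len(tail) // K
    batches = tail[: K * n].reshape(K, n, -1)
    return batches[:, n0:, :].mean(axis=1)  # one d-vector per batch

# Constant stepsize alpha = 0.2 as in the paper; T, burn, K, n0 are guesses.
path = lsa_constant_step(alpha=0.2, T=20_000)
means = batch_means(path, burn=2_000, K=30, n0=100)
center = means.mean(axis=0)
half_width = 1.96 * means.std(axis=0, ddof=1) / np.sqrt(len(means))
print("point estimate:", center)
print("95% CI half-widths:", half_width)
```

With only K = 30 batch means, a Student-t quantile would be slightly more accurate than the normal quantile 1.96 used above; the paper's exact CI construction may differ.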