Deep Learning for Computing Convergence Rates of Markov Chains

Authors: Yanlin Qu, Jose Blanchet, Peter W. Glynn

NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We analyze the sample complexity of the algorithm and further demonstrate the effectiveness of the DCDC by generating convergence bounds for realistic Markov chains arising from stochastic processing networks as well as constant step-size stochastic optimization." (Section 5: Numerical Examples)
Researcher Affiliation | Academia | "Yanlin Qu, Jose Blanchet, Peter Glynn; Department of Management Science and Engineering, Stanford University; {quyanlin,jose.blanchet,glynn}@stanford.edu"
Pseudocode | Yes | "Algorithm 1 Deep Contractive Drift Calculator (DCDC)"
Open Source Code | Yes | "The code is available in the supplementary material."
Open Datasets | No | The paper describes generating synthetic data for its numerical examples (e.g., "we set m = 100 and uniformly generate 100 x_i's" for SGD; "interarrival time T ∼ U[0, 0.2] and arriving amount Z ∼ U[0, 0.1]" for the tandem network), but it does not specify any publicly available datasets with links, DOIs, or formal citations.
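For context, the quoted synthetic-data description can be reproduced with a few lines of sampling code. The sketch below assumes NumPy; the input dimension and support of the x_i's and the number of sampled events are illustrative assumptions, since only m = 100 and the two uniform distributions are stated in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# SGD example: the paper generates m = 100 inputs x_i uniformly.
# The dimension d and the support [0, 1] below are assumptions for illustration.
m, d = 100, 10
x = rng.uniform(0.0, 1.0, size=(m, d))

# Tandem-network example: interarrival times T ~ U[0, 0.2] and
# arriving amounts Z ~ U[0, 0.1]; the horizon n is a placeholder.
n = 10_000
T = rng.uniform(0.0, 0.2, size=n)
Z = rng.uniform(0.0, 0.1, size=n)
```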
Dataset Splits | No | The paper describes how it generates data for its numerical examples and trains a neural network on these samples, but it does not define conventional training, validation, or test splits for the Markov chains or for the neural-network training, so the data partitioning for these experiments cannot be reproduced directly from the text.
Hardware Specification | Yes | "Each training procedure in this paper was completed within ten minutes on an M2 MacBook Air with 8GB RAM."
Software Dependencies | No | The paper mentions "Adam steps" for training the neural network but does not specify version numbers for any software libraries or dependencies (e.g., Python, PyTorch, TensorFlow).
Experiment Setup | Yes | "For the SGD, we set the regularization parameter λ = 1, step-size α = 0.1, and batch-size β = 10. For DCDC, we run 1M Adam steps to train a single-layer network with width 1000 and sigmoid activation."
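To make the quoted network configuration concrete, here is a minimal sketch, assuming PyTorch (which the paper does not name). Only the width (1000), activation (sigmoid), optimizer (Adam), and step count (1M) come from the paper; the state dimension, learning rate, training batch size, and the loss function are placeholders, since the actual objective is the contractive-drift-equation residual defined by Algorithm 1 (DCDC).

```python
import torch
import torch.nn as nn

# Single-hidden-layer network with width 1000 and sigmoid activation,
# matching the quoted setup. The state dimension d is an assumption.
d = 10
net = nn.Sequential(nn.Linear(d, 1000), nn.Sigmoid(), nn.Linear(1000, 1))

# Adam optimizer; the learning rate is not reported, so this value is a placeholder.
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

def cde_residual_loss(states: torch.Tensor) -> torch.Tensor:
    # Placeholder objective: the real loss is the CDE residual from Algorithm 1,
    # estimated from one-step transition samples of the chain under study.
    return net(states).pow(2).mean()

# 1M Adam steps, as described; each step draws a fresh batch of sampled states
# (here from a hypothetical uniform state sampler).
for step in range(1_000_000):
    states = torch.rand(128, d)
    opt.zero_grad()
    loss = cde_residual_loss(states)
    loss.backward()
    opt.step()
```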