Adapting to Mixing Time in Stochastic Optimization with Markovian Data

Authors: Ron Dorfman, Kfir Yehuda Levy

ICML 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental Finally, we validate the performance of our algorithm on a synthetic linear regression task. and 6. Experiments We demonstrate the performance of our approach on a synthetic linear regression problem with Markovian data.
Researcher Affiliation Academia 1Viterby Faculty of Electrical and Computer Engineering, Technion, Haifa, Israel 2A Viterbi Fellow.
Pseudocode Yes Algorithm 1 MAG (MLMC Ada Grad)
Open Source Code No The paper does not provide any explicit statement about making the source code available or a link to a code repository.
Open Datasets No We consider a symmetric 2-state Markov chain, for which the probability of transitioning between states is p (0, 1). For each state i {1, 2}, we randomly choose w i Rd and draw Xi Rn d according to N(0, I). Then, we set yi = Xiw i + ϵ, where ϵ is a random noise vector N(0, 10 3).
Dataset Splits No The paper describes generating a synthetic dataset for experiments but does not provide explicit details about train/validation/test splits, as it appears to be a continuous simulation rather than a partitioned dataset.
Hardware Specification No The paper does not provide specific details about the hardware used for running the experiments.
Software Dependencies No The paper does not provide specific software dependencies with version numbers.
Experiment Setup Yes In practice, we observed that the multiplicative 2J factor in the MLMC estimator resulted in high magnitude gradients which in turn significantly reduced the learning rate of Ada Grad and slowed down future progress. Therefore, instead of drawing J Geom( 1 2), we used a truncated geometric distribution such that P(J = j) 2 j, j [K] for some fixed K. We used K = 5, which was not optimally tuned. In Figure 1 we present the performance of the methods for n = 250, d = 100 and p = 10 4.