First Order Methods with Markovian Noise: from Acceleration to Variational Inequalities

Authors: Aleksandr Beznosikov, Sergey Samsonov, Marina Sheshukova, Alexander Gasnikov, Alexey Naumov, Eric Moulines

NeurIPS 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Theoretical This paper delves into stochastic optimization problems that involve Markovian noise. We present a unified approach for the theoretical analysis of first-order gradient methods for stochastic optimization and variational inequalities. Our approach covers scenarios for both non-convex and strongly convex minimization problems. To achieve an optimal (linear) dependence on the mixing time of the underlying noise sequence, we use the randomized batching scheme, which is based on the multilevel Monte Carlo method. Moreover, our technique allows us to eliminate the limiting assumptions of previous research on Markov noise, such as the need for a bounded domain and uniformly bounded stochastic gradients. Our extension to variational inequalities under Markovian noise is original. Additionally, we provide lower bounds that match the oracle complexity of our method in the case of strongly convex optimization problems.
Researcher Affiliation Collaboration Aleksandr Beznosikov Innopolis University, Skoltech, MIPT, Yandex Sergey Samsonov HSE University Marina Sheshukova HSE University Alexander Gasnikov MIPT, Skoltech, IITP RAS Alexey Naumov HSE University Eric Moulines Ecole polytechnique
Pseudocode Yes Algorithm 1 Randomized Accelerated GD; Algorithm 2 Randomized GD; Algorithm 3 Randomized Extra Gradient
Open Source Code No The paper does not provide any explicit statements or links indicating that source code for the described methodology is available.
Open Datasets No This is a theoretical paper focused on deriving complexity bounds and algorithmic analysis, not on empirical evaluation using specific datasets.
Dataset Splits No This is a theoretical paper focused on deriving complexity bounds and algorithmic analysis, not on empirical evaluation using specific datasets. Therefore, it does not provide details on dataset splits for training, validation, or testing.
Hardware Specification No This is a theoretical paper and does not describe empirical experiments, therefore no hardware specifications are provided.
Software Dependencies No This is a theoretical paper and does not describe empirical experiments, therefore no specific software dependencies with version numbers are provided.
Experiment Setup No This is a theoretical paper and does not describe empirical experiments, therefore no specific experimental setup details like hyperparameter values or training configurations are provided.