Variance Reduction for Matrix Games

Authors: Yair Carmon, Yujia Jin, Aaron Sidford, Kevin Tian

NeurIPS 2019

Reproducibility Variable | Result | LLM Response
Research Type: Theoretical. We present a randomized primal-dual algorithm that solves the problem min_x max_y yᵀAx to additive error ε in time Õ(nnz(A) + √(nnz(A)·n)/ε), for matrix A with larger dimension n and nnz(A) nonzero entries. This improves the best known exact gradient methods by a factor of √(nnz(A)/n) and is faster than fully stochastic gradient methods in the accurate and/or sparse regime ε ≤ √(n/nnz(A)). Our results hold for x, y in the simplex (matrix games, linear programming) and for x in an ℓ2 ball and y in the simplex (perceptron / SVM, minimum enclosing ball). Our algorithm combines Nemirovski's "conceptual prox-method" and a novel reduced-variance gradient estimator based on sampling from the difference between the current iterate and a reference point. The paper focuses on theoretical analysis, algorithm design, and complexity bounds rather than empirical studies or data analysis.
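In the simplex–simplex setting the additive error above is the duality gap, which has a closed form because the inner maximum and minimum over a simplex are attained at vertices. A minimal sketch of this quantity (the function name and uniform test strategies are illustrative, not from the paper):

```python
import numpy as np

def matrix_game_gap(A, x, y):
    # Duality gap of min_x max_y y^T A x at a feasible pair (x, y):
    #   Gap(x, y) = max_i (A x)_i - min_j (A^T y)_j,
    # since the best response over a simplex is a vertex.
    return np.max(A @ x) - np.min(A.T @ y)

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 4))
x = np.full(4, 1 / 4)  # uniform strategy for the min player
y = np.full(5, 1 / 5)  # uniform strategy for the max player
gap = matrix_game_gap(A, x, y)
```

The gap is nonnegative at every feasible pair and zero exactly at a saddle point, which is why the paper's guarantees are stated as E Gap ≤ ε.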
Researcher Affiliation: Academia. Yair Carmon, Yujia Jin, Aaron Sidford and Kevin Tian, Stanford University, {yairc,yujiajin,sidford,kjtian}@stanford.edu
Pseudocode: Yes.
Algorithm 1: OuterLoop(O) (Nemirovski [28])
  Input: (α, ε)-relaxed proximal oracle O(z) for gradient mapping g, distance-generating r
  Parameters: number of iterations K
  Output: point z̄_K with E Gap(z̄_K) ≤ αΘ/K + ε
  1: z_0 ← argmin_{z∈Z} r(z)
  2: for k = 1, ..., K do
  3:   z_{k-1/2} ← O(z_{k-1})    ▷ we implement O(z_{k-1}) by calling InnerLoop(z_{k-1}, g_{z_{k-1}}, ε)
  4:   z_k ← Prox_{z_{k-1}}(g(z_{k-1/2})) = argmin_{z∈Z} { ⟨g(z_{k-1/2}), z⟩ + V_{z_{k-1}}(z) }
  5: return z̄_K = (1/K) Σ_{k=1}^K z_{k-1/2}
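As a sanity check, the outer loop can be instantiated with an exact prox step in place of the paper's variance-reduced InnerLoop oracle: for the simplex–simplex game the entropic (KL) prox has a closed form, and the loop reduces to Nemirovski's mirror prox. A minimal sketch, where the step size, iteration count, and random test matrix are purely illustrative:

```python
import numpy as np

def simplex_prox(z, g, eta):
    # Entropic prox step on the probability simplex:
    # argmin_u { <eta * g, u> + KL(u || z) }, in closed form.
    w = z * np.exp(-eta * g)
    return w / w.sum()

def outer_loop(A, K=500, eta=0.1):
    # Mirror-prox outer loop for min_x max_y y^T A x on simplices,
    # with the gradient mapping g(x, y) = (A^T y, -A x).
    m, n = A.shape
    x, y = np.full(n, 1.0 / n), np.full(m, 1.0 / m)
    x_sum, y_sum = np.zeros(n), np.zeros(m)
    for _ in range(K):
        # Half-step: exact prox at the current gradient (a trivial oracle).
        xh = simplex_prox(x, A.T @ y, eta)
        yh = simplex_prox(y, -(A @ x), eta)
        # Full step: same prox center, gradient taken at the half-point.
        x = simplex_prox(x, A.T @ yh, eta)
        y = simplex_prox(y, -(A @ xh), eta)
        x_sum += xh
        y_sum += yh
    # Return the average of the half-iterates, as in Algorithm 1.
    return x_sum / K, y_sum / K

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 5))
xb, yb = outer_loop(A)
gap = np.max(A @ xb) - np.min(A.T @ yb)  # duality gap at the averaged pair
```

With an exact prox the gap of the averaged iterates decays at the classical Θ/(ηK) rate; the paper's contribution is replacing this exact step with a cheap randomized (α, ε)-relaxed oracle without losing the guarantee.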
Open Source Code: No. The paper is theoretical, describing algorithms and their complexity. It does not mention any software implementation or provide links to open-source code for the described methodology.
Open Datasets: No. The paper focuses on algorithm design and complexity analysis for matrix games; it does not involve experimental evaluation on specific datasets, so no dataset availability information is provided.
Dataset Splits: No. The paper does not involve experimental evaluation on datasets, so there is no mention of training, validation, or test splits.
Hardware Specification: No. The paper reports no experiments that would require specific hardware, so no hardware specifications are mentioned.
Software Dependencies: No. The paper describes no experimental implementation, so no software dependencies with version numbers are mentioned.
Experiment Setup: No. The paper does not describe any experimental setup details such as hyperparameters or system-level training settings.