Variance-Reduced Stochastic Gradient Descent on Streaming Data

Authors: Ellango Jothimurugesan, Ashraf Tahmasbi, Phillip Gibbons, Srikanta Tirthapura

NeurIPS 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our theoretical and experimental results show that the risk of STRSAGA is comparable to that of an offline algorithm on a variety of input arrival patterns, and its experimental performance is significantly better than prior algorithms suited for streaming data, such as SGD and SSVRG.
Researcher Affiliation | Academia | Ellango Jothimurugesan, Carnegie Mellon University, ejothimu@cs.cmu.edu; Ashraf Tahmasbi, Iowa State University, tahmasbi@iastate.edu; Phillip B. Gibbons, Carnegie Mellon University, gibbons@cs.cmu.edu; Srikanta Tirthapura, Iowa State University, snt@iastate.edu
Pseudocode | Yes | Algorithm 1 depicts the steps taken to process the zero or more points Xi arriving at time step i.
Open Source Code | No | No explicit statement or link providing access to the open-source code for the methodology described in this paper was found.
Open Datasets | Yes | For logistic regression, we use the A9A [DKT17] and RCV1.binary [LYRL04] datasets, and for matrix factorization, we use two datasets of user-item ratings from Movielens [HK16]. More detail on the datasets are provided in the supplementary material.
Dataset Splits | No | The paper does not provide specific details on training, validation, and test dataset splits (e.g., percentages or sample counts) in the main text.
Hardware Specification | No | No specific hardware details (e.g., CPU/GPU models, memory) used for the experiments were mentioned in the paper.
Software Dependencies | No | No specific software dependencies with version numbers were mentioned in the paper.
Experiment Setup | Yes | In our experiments, the training data arrives over the course of 100 time steps, with skewed arrivals parameterized by M = 8λ. At each time step i, a streaming data algorithm has access to ρ gradient computations to update the model; we show results for ρ/λ = 1 and ρ/λ = 5.
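
The streaming setup quoted in the Experiment Setup entry (100 time steps, bursty arrivals, ρ gradient computations per time step) can be illustrated with a rough Python sketch. This is not the paper's Algorithm 1 or STRSAGA itself. The function names `skewed_arrivals` and `run_stream` are hypothetical, and the exact burst pattern (M = 8λ points every 8 steps, none in between) is an assumption, since the summary only states that arrivals are skewed and parameterized by M = 8λ. The `step` callback stands in for whatever single-gradient update an algorithm would perform.

```python
import random


def skewed_arrivals(num_steps, lam, burst_factor=8):
    """Hypothetical skewed schedule (assumed, not from the paper's text):
    M = burst_factor * lam points arrive every burst_factor steps,
    and no points arrive in the steps in between."""
    burst = burst_factor * lam
    return [burst if i % burst_factor == 0 else 0 for i in range(num_steps)]


def run_stream(points, arrivals, rho, step, seed=0):
    """Reveal points according to the schedule and spend a budget of
    rho calls to step(sample) per time step on the points seen so far.
    Returns the total number of step calls made."""
    rng = random.Random(seed)
    seen = []      # points that have arrived so far
    cursor = 0     # index of the next point that has not arrived yet
    calls = 0
    for count in arrivals:
        seen.extend(points[cursor:cursor + count])
        cursor += count
        if not seen:
            continue  # nothing to sample from yet
        for _ in range(rho):
            step(rng.choice(seen))  # one gradient computation (placeholder)
            calls += 1
    return calls


if __name__ == "__main__":
    lam = 10                      # assumed average arrival rate
    arrivals = skewed_arrivals(100, lam)
    data = list(range(sum(arrivals)))
    for ratio in (1, 5):          # the two rho/lambda settings in the summary
        total = run_stream(data, arrivals, ratio * lam, lambda x: None)
        print(f"rho/lambda = {ratio}: {total} gradient computations")
```

Under these assumptions the per-step budget is fixed by ρ regardless of how many points arrive in that step, which is the constraint the summary describes; uniform sampling from the seen points is a plain SGD-style placeholder, not the sampling rule of STRSAGA.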