Three Operator Splitting with Subgradients, Stochastic Gradients, and Adaptive Learning Rates
Authors: Alp Yurtsever, Alex Gu, Suvrit Sra
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We compare our proposed methods with competing methods on various applications. ... In Section 6, we discuss the empirical performance of our methods by comparing them against established methods on various benchmark problems from the COPT Library (Pedregosa et al., 2020), including overlapping group lasso, total variation deblurring, and sparse and low-rank matrix recovery. We also test our methods on nonconvex optimization by training a neural network model. |
| Researcher Affiliation | Academia | Alp Yurtsever Umeå University alp.yurtsever@umu.se Alex Gu & Suvrit Sra Massachusetts Institute of Technology {gua,suvrit}@mit.edu |
| Pseudocode | Yes | Algorithm 1 Three Operator Splitting (TOS). Input: initial point y_0 ∈ R^n, step-size sequence {γ_t}_{t=0}^T. For t = 0, 1, 2, …, T: z_t = prox_{γ_t g}(y_t); choose an update direction u_t ∈ R^n (u_t = ∇f(z_t) captures the standard version of TOS); x_t = prox_{γ_t h}(2z_t − y_t − γ_t u_t); y_{t+1} = y_t − z_t + x_t. Return: ergodic sequences x_t and z_t defined in (5). |
| Open Source Code | Yes | The source code for the experiments is available in the supplements. ... We implemented the proposed methods for the openopt/copt Python library. |
| Open Datasets | Yes | We use the benchmarks on synthetic data (dimensions n = 1002, N = 100) and the real-sim dataset (Chang & Lin, 2011) (n = 20958, N = 72309). ... We reuse the open source implementation (built with the Lasagne framework based on Theano) published in (Scardapane et al., 2017) under the BSD-2 License. We follow their experimental setup and instructions with the MNIST database (LeCun, 1998) containing 70k grayscale images (28 × 28) of handwritten digits (split 75/25 into train and test partitions). |
| Dataset Splits | Yes | with the MNIST database (LeCun, 1998) containing 70k grayscale images (28 × 28) of handwritten digits (split 75/25 into train and test partitions). |
| Hardware Specification | Yes | Our experiments are performed in Python 3.7 with Intel Core i9-9820X CPU @ 3.30GHz. |
| Software Dependencies | Yes | Our experiments are performed in Python 3.7 with Intel Core i9-9820X CPU @ 3.30GHz. ... We reuse the open source implementation (built with Lasagne framework based on Theano) published in (Scardapane et al., 2017) under BSD-2 License. |
| Experiment Setup | Yes | For ADAPTOS, we discard β and tune α by trying the powers of 10. ... We train a fully connected neural network with 784 input features, three hidden layers (400/300/100) and 10-dimensional output layer. ... This experiment is performed with 20 random seeds. |
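The TOS iteration quoted in the Pseudocode row above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the objective (a quadratic f, an ℓ1 penalty g, and a box constraint h), the proximal operators, and the fixed step size are all assumptions chosen so the example is self-contained and easy to check.

```python
import numpy as np

def prox_l1(v, gamma):
    # Proximal operator of gamma * ||.||_1 (soft-thresholding).
    return np.sign(v) * np.maximum(np.abs(v) - gamma, 0.0)

def prox_box(v, gamma):
    # Proximal operator of the indicator of the box [-1, 1]^n (projection).
    return np.clip(v, -1.0, 1.0)

def tos(grad_f, prox_g, prox_h, y0, step_sizes):
    """Three Operator Splitting for min_x f(x) + g(x) + h(x),
    following Algorithm 1 with the standard direction u_t = grad f(z_t)."""
    y = y0.copy()
    for gamma in step_sizes:
        z = prox_g(y, gamma)                       # z_t = prox_{γ_t g}(y_t)
        u = grad_f(z)                              # standard TOS direction
        x = prox_h(2 * z - y - gamma * u, gamma)   # x_t = prox_{γ_t h}(2z_t - y_t - γ_t u_t)
        y = y - z + x                              # y_{t+1} = y_t - z_t + x_t
    return x, z

# Toy problem: min 0.5 * ||x - b||^2 + ||x||_1 + indicator([-1, 1]^n).
# The closed-form solution is clip(soft_threshold(b, 1), -1, 1) = [1, 0, 0].
b = np.array([2.0, -0.3, 0.6])
grad_f = lambda x: x - b
x, z = tos(grad_f, prox_l1, prox_box, np.zeros(3), [0.5] * 200)
```

With f being 1-smooth, a fixed step γ = 0.5 lies safely inside the usual (0, 2/L) range for TOS with exact gradients; the subgradient, stochastic, and adaptive variants studied in the paper instead vary `u` and `step_sizes`.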