A Variant of Anderson Mixing with Minimal Memory Size

Authors: Fuchao Wei, Chenglong Bao, Yang Liu, Guangwen Yang

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results on logistic regression and network training problems validate the effectiveness of the proposed Min-AM.
Researcher Affiliation | Academia | 1 Department of Computer Science and Technology, Tsinghua University, China; 2 Institute for AI Industry Research (AIR), Tsinghua University, China; 3 Yau Mathematical Sciences Center, Tsinghua University, China; 4 Yanqi Lake Beijing Institute of Mathematical Sciences and Applications
Pseudocode | Yes | The algorithm is shown in Algorithm 1.
Open Source Code | Yes | Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Yes] See the supplemental material.
Open Datasets | Yes | We conducted regularized logistic regression on the datasets madelon and a9a from LIBSVM [14]. ... We applied the restarted Min-AM to logistic regression and the stochastic Min-AM to train neural networks. ... Experiments on CIFAR. ... Experiments on ImageNet.
Dataset Splits | Yes | We did not perform an explicit train/validation split for the CIFAR datasets; instead, we used the default training/testing split provided by the datasets.
Hardware Specification | Yes | All experiments are conducted on 8 Nvidia V100 GPUs.
Software Dependencies | Yes | All experiments are implemented with PyTorch (version 1.10.0).
Experiment Setup | Yes | We train all models for 160 epochs with a batch size of 128. For the CIFAR datasets, the learning rate is initialized to 0.1 and decayed by a factor of 0.1 at epochs 80 and 120.
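
For context on what Min-AM varies, the sketch below implements classical windowed Anderson mixing for a fixed-point map x = g(x) in NumPy. It is illustrative only: it is the full-history baseline, not the paper's Min-AM (Algorithm 1), whose contribution is achieving comparable extrapolation with a minimal memory size. The function name and window parameter m are our own choices, not the paper's notation.

```python
import numpy as np

def anderson_mixing(g, x0, m=5, max_iter=100, tol=1e-8):
    """Classical windowed Anderson mixing for x = g(x) (baseline, not Min-AM)."""
    x = np.asarray(x0, dtype=float).copy()
    G, R = [], []  # histories of g(x_i) and residuals r_i = g(x_i) - x_i
    for _ in range(max_iter):
        gx = g(x)
        r = gx - x
        if np.linalg.norm(r) < tol:
            return gx
        G.append(gx)
        R.append(r)
        if len(R) > m + 1:  # keep at most m difference pairs in memory
            G.pop(0)
            R.pop(0)
        if len(R) == 1:
            x = gx  # plain fixed-point step until history accumulates
        else:
            # Solve min_gamma || r - dR @ gamma ||_2, the unconstrained form of
            # the sum(alpha) = 1 least-squares mixing problem, where the columns
            # of dR are consecutive residual differences.
            dR = np.stack([R[i + 1] - R[i] for i in range(len(R) - 1)], axis=1)
            dG = np.stack([G[i + 1] - G[i] for i in range(len(G) - 1)], axis=1)
            gamma, *_ = np.linalg.lstsq(dR, r, rcond=None)
            x = gx - dG @ gamma  # extrapolated iterate
    return x

# Example: accelerate the fixed-point iteration x = cos(x).
print(anderson_mixing(np.cos, np.array([1.0])))  # ~0.739085
```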
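The LIBSVM datasets named in the Open Datasets row (madelon, a9a) are distributed in svmlight format and can be read with scikit-learn; the snippet below is a hedged example, with "a9a" as a placeholder path assuming the file was downloaded beforehand from the LIBSVM dataset collection.

```python
# Load an svmlight/LIBSVM-format file; "a9a" is a placeholder local path.
from sklearn.datasets import load_svmlight_file

X, y = load_svmlight_file("a9a")  # X: sparse CSR feature matrix, y: labels
print(X.shape, y.shape)
```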
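The Dataset Splits and Experiment Setup rows together describe a standard CIFAR recipe, sketched below in PyTorch: the default torchvision train/test split, batch size 128, 160 epochs, and a learning rate of 0.1 decayed by 0.1 at epochs 80 and 120. The model, momentum, and weight decay are assumptions made for a runnable example, and SGD is a stand-in: the paper trains with stochastic Min-AM, which is not reproduced here.

```python
import torch
import torchvision
import torchvision.transforms as T

# Default torchvision train/test split, matching the Dataset Splits row.
transform = T.ToTensor()
train_set = torchvision.datasets.CIFAR10("./data", train=True, download=True, transform=transform)
test_set = torchvision.datasets.CIFAR10("./data", train=False, download=True, transform=transform)
train_loader = torch.utils.data.DataLoader(train_set, batch_size=128, shuffle=True)

model = torchvision.models.resnet18(num_classes=10)  # placeholder architecture
# SGD stands in for stochastic Min-AM; momentum/weight decay are assumed values.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9, weight_decay=5e-4)
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[80, 120], gamma=0.1)
criterion = torch.nn.CrossEntropyLoss()

for epoch in range(160):
    for images, labels in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
    scheduler.step()  # lr: 0.1 -> 0.01 at epoch 80 -> 0.001 at epoch 120
```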