Variance Reduction With Sparse Gradients
Authors: Melih Elibol, Lihua Lei, Michael I. Jordan
ICLR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirically, our algorithm consistently outperforms Spider Boost using various models on various tasks including image classification, natural language processing, and sparse matrix factorization. |
| Researcher Affiliation | Academia | Melih Elibol, Michael I. Jordan (University of California, Berkeley; {elibol,jordan}@cs.berkeley.edu); Lihua Lei (Stanford University; lihualei@stanford.edu) |
| Pseudocode | Yes | Algorithm 1: Spider Boost with Sparse Gradients (a hedged sketch of this update appears after the table). |
| Open Source Code | No | The paper does not contain an explicit statement about releasing open-source code or provide a link to a code repository. |
| Open Datasets | Yes | For datasets, we use CIFAR-10 (Krizhevsky et al.), SVHN (Netzer et al., 2011), and MNIST (LeCun & Cortes, 2010). |
| Dataset Splits | No | The paper does not explicitly provide details about training/validation/test dataset splits, such as specific percentages or sample counts. |
| Hardware Specification | No | The paper does not explicitly describe the hardware used to run its experiments, such as specific GPU or CPU models. |
| Software Dependencies | No | The paper mentions software like TensorFlow and PyTorch but does not provide specific version numbers for these or other software dependencies. |
| Experiment Setup | Yes | For all experiments, unless otherwise specified, we run Spider Boost and Sparse Spider Boost with a learning rate η = 0.1, large-batch size B = 1000, small-batch size b = 100, inner loop length of m = 10, memory decay factor of α = 0.5, and k1 and k2 both set to 5% of the total number of model parameters (these settings are instantiated in the usage sketch after the table). |
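
The pseudocode row above refers to Algorithm 1 (Spider Boost with Sparse Gradients), which is not reproduced in this summary. The sketch below is one plausible reading of it, not the paper's definitive method: a SpiderBoost-style variance-reduced estimator whose large-batch gradient and small-batch correction are restricted to the top-k coordinates, with the memory decay factor α interpreted here as an exponential moving average of coordinate magnitudes. The `grad_fn` oracle, the single sparsity level `k_frac`, and the memory update rule are illustrative assumptions.

```python
import numpy as np

def top_k_mask(scores, k):
    """Boolean mask keeping the k largest-magnitude entries of `scores`."""
    mask = np.zeros(scores.shape, dtype=bool)
    if k > 0:
        idx = np.argpartition(np.abs(scores), -k)[-k:]
        mask[idx] = True
    return mask

def sparse_spider_boost(x0, grad_fn, n, T, eta=0.1, B=1000, b=100, m=10,
                        alpha=0.5, k_frac=0.05, seed=0):
    """Hedged sketch of SpiderBoost with sparse (top-k) gradient estimates.

    grad_fn(x, idx) is a hypothetical oracle returning the gradient of the
    loss on examples `idx` at parameters `x`; `n` is the dataset size.
    """
    rng = rng_src = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    x_prev = x.copy()
    d = x.size
    k = max(1, int(k_frac * d))      # coordinates kept (k1 = k2 in this sketch)
    memory = np.zeros(d)             # decayed record of coordinate magnitudes
    v = np.zeros(d)                  # variance-reduced gradient estimate

    for t in range(T):
        if t % m == 0:
            # Large-batch gradient every m iterations (outer step of SpiderBoost).
            idx = rng.choice(n, size=min(B, n), replace=False)
            g = grad_fn(x, idx)
            memory = alpha * memory + np.abs(g)
            v = np.where(top_k_mask(memory, k), g, 0.0)
        else:
            # SPIDER correction: the same small batch evaluated at x_t and x_{t-1}.
            idx = rng.choice(n, size=min(b, n), replace=False)
            delta = grad_fn(x, idx) - grad_fn(x_prev, idx)
            memory = alpha * memory + np.abs(delta)
            v = v + np.where(top_k_mask(memory, k), delta, 0.0)
        x_prev = x.copy()
        x = x - eta * v              # plain gradient step with the estimate v
    return x
```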
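
As a usage example, the snippet below instantiates the sketch with the hyperparameters reported in the experiment-setup row (η = 0.1, B = 1000, b = 100, m = 10, α = 0.5, and 5% sparsity). The synthetic least-squares problem and its dimensions are hypothetical and serve only to show the interface; the paper's experiments use CIFAR-10, SVHN, and MNIST with neural network models.

```python
import numpy as np

# Hypothetical synthetic least-squares problem, used only to exercise the sketch above.
rng = np.random.default_rng(1)
n, d = 5000, 200
A = rng.standard_normal((n, d))
y = A @ rng.standard_normal(d) + 0.1 * rng.standard_normal(n)

def grad_fn(x, idx):
    # Mini-batch gradient of the least-squares loss 0.5 * mean((A x - y)^2).
    Ai, yi = A[idx], y[idx]
    return Ai.T @ (Ai @ x - yi) / len(idx)

# Hyperparameters as reported in the experiment-setup row above.
x_hat = sparse_spider_boost(np.zeros(d), grad_fn, n, T=200,
                            eta=0.1, B=1000, b=100, m=10,
                            alpha=0.5, k_frac=0.05)
```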