Positive-Negative Momentum: Manipulating Stochastic Gradient Noise to Improve Generalization
Authors: Zeke Xie, Li Yuan, Zhanxing Zhu, Masashi Sugiyama
ICML 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We provide extensive experimental results to verify that PNM can indeed make significant improvements over conventional Momentum, shown in Table 1. |
| Researcher Affiliation | Academia | The University of Tokyo; National University of Singapore; Beijing Institute of Big Data Research; RIKEN Center for AIP. |
| Pseudocode | Yes | Algorithm 1 (Stochastic) Heavy Ball/Momentum; Algorithm 2 (Stochastic) PNM; Algorithm 3 AdaPNM |
| Open Source Code | Yes | Code: https://github.com/zeke-xie/Positive-Negative-Momentum. |
| Open Datasets | Yes | CIFAR-10/CIFAR-100 (Krizhevsky & Hinton, 2009), ImageNet (Deng et al., 2009) and Penn TreeBank (Marcus et al., 1993). |
| Dataset Splits | No | The paper refers to using standard datasets like CIFAR-10, CIFAR-100, ImageNet, and Penn Tree Bank, but does not explicitly state the training, validation, and test splits (e.g., percentages or sample counts). |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running experiments, such as GPU models, CPU types, or cloud instance specifications. |
| Software Dependencies | No | The paper mentions 'PyTorch' but does not provide specific version numbers for any software dependencies like libraries, frameworks, or languages. |
| Experiment Setup | Yes | In our paper, we choose γ = 5 as the default setting, which corresponds to β0 = 1. We leave the implementation details in Appendix B. |
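Since the table only names Algorithm 2 (PNM) without reproducing it, the snippet below is a minimal sketch of the positive-negative momentum update on a toy problem, written from the paper's description rather than the released PyTorch optimizer. The function name `pnm_sgd`, the buffer names, the normalization by sqrt((1 + β0)² + β0²), and the toy quadratic objective are illustrative assumptions, not the authors' exact implementation.

```python
# Sketch of the Positive-Negative Momentum (PNM) update: keep two momentum
# buffers updated on alternating steps and combine them with a positive weight
# on the fresh buffer and a negative weight on the stale one.
import numpy as np

def pnm_sgd(grad_fn, theta0, lr=0.1, beta1=0.9, beta0=1.0, steps=100):
    theta = np.asarray(theta0, dtype=float)
    m = [np.zeros_like(theta), np.zeros_like(theta)]  # two alternating momentum buffers
    norm = np.sqrt((1.0 + beta0) ** 2 + beta0 ** 2)   # assumed scale normalization
    for t in range(steps):
        g = grad_fn(theta)
        cur, prev = t % 2, (t + 1) % 2
        # update only the buffer whose parity matches the current step
        m[cur] = beta1 ** 2 * m[cur] + (1.0 - beta1 ** 2) * g
        # positive weight (1 + beta0) on the fresh buffer, negative weight -beta0 on the other
        update = (1.0 + beta0) * m[cur] - beta0 * m[prev]
        theta = theta - lr / norm * update
    return theta

# Toy usage: minimize f(theta) = 0.5 * ||theta||^2, whose gradient is theta.
print(pnm_sgd(lambda th: th, theta0=[5.0, -3.0]))
```

With β0 = 1 (the value the paper's default γ = 5 corresponds to, per the Experiment Setup row), the combined update weights the fresh momentum buffer by +2 and the stale one by -1.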