Positive-Negative Momentum: Manipulating Stochastic Gradient Noise to Improve Generalization

Authors: Zeke Xie, Li Yuan, Zhanxing Zhu, Masashi Sugiyama

ICML 2021

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We provide extensive experimental results to verify that PNM can indeed make significant improvements over conventional Momentum, shown in Table 1. |
| Researcher Affiliation | Academia | The University of Tokyo; National University of Singapore; Beijing Institute of Big Data Research; RIKEN Center for AIP. |
| Pseudocode | Yes | Algorithm 1: (Stochastic) Heavy Ball/Momentum; Algorithm 2: (Stochastic) PNM; Algorithm 3: AdaPNM. |
| Open Source Code | Yes | Code: https://github.com/zeke-xie/Positive-Negative-Momentum |
| Open Datasets | Yes | CIFAR-10/CIFAR-100 (Krizhevsky & Hinton, 2009), ImageNet (Deng et al., 2009), and Penn Treebank (Marcus et al., 1993). |
| Dataset Splits | No | The paper uses standard datasets (CIFAR-10, CIFAR-100, ImageNet, and Penn Treebank) but does not explicitly state the training, validation, and test splits (e.g., percentages or sample counts). |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running experiments, such as GPU models, CPU types, or cloud instance specifications. |
| Software Dependencies | No | The paper mentions PyTorch but does not provide version numbers for any software dependencies such as libraries, frameworks, or languages. |
| Experiment Setup | Yes | In our paper, we choose γ = 5 as the default setting, which corresponds to β0 = 1. We leave the implementation details in Appendix B. |
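To make the PNM update named in Algorithm 2 concrete, here is a minimal illustrative sketch. It assumes (this is our reading, not a verbatim transcription of the paper) that PNM keeps two interleaved momentum buffers, m_t = β1² · m_{t−2} + (1 − β1²) · g_t, and takes the step θ ← θ − (lr / √γ) · ((1 + β0) · m_t − β0 · m_{t−1}), where γ = (1 + β0)² + β0², so γ = 5 when β0 = 1 as in the paper's default setting. The function name `pnm_minimize` and all hyperparameter values are hypothetical.

```python
import math

def pnm_minimize(grad, theta0, lr=0.1, beta0=1.0, beta1=0.9, steps=300):
    """Toy positive-negative momentum loop (illustrative sketch, not the
    authors' implementation). `grad` maps a scalar parameter to its gradient."""
    theta = theta0
    m_prev, m_prev2 = 0.0, 0.0  # m_{t-1} and m_{t-2} (the two interleaved buffers)
    # Normalization gamma = (1 + beta0)^2 + beta0^2; equals 5 when beta0 = 1.
    norm = math.sqrt((1 + beta0) ** 2 + beta0 ** 2)
    for _ in range(steps):
        g = grad(theta)
        # Even/odd-step momentum chain: m_t depends on m_{t-2}, not m_{t-1}.
        m = beta1 ** 2 * m_prev2 + (1 - beta1 ** 2) * g
        # Positive weight on m_t, negative weight on m_{t-1}, rescaled by sqrt(gamma).
        theta -= (lr / norm) * ((1 + beta0) * m - beta0 * m_prev)
        m_prev2, m_prev = m_prev, m
    return theta

# Toy usage: minimize f(x) = x^2 (gradient 2x) starting from x = 5.
x_star = pnm_minimize(lambda x: 2 * x, theta0=5.0)
```

The negative coefficient on the lagged buffer is what amplifies the stochastic gradient noise relative to plain momentum, which is the mechanism the paper credits for the generalization gain.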