Learning towards Minimum Hyperspherical Energy

Authors: Weiyang Liu, Rongmei Lin, Zhen Liu, Lixin Liu, Zhiding Yu, Bo Dai, Le Song

NeurIPS 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments demonstrate the effectiveness of our intuition, by showing the superior performance with MHE regularization.
Researcher Affiliation | Collaboration | Georgia Institute of Technology, Emory University, South China University of Technology, NVIDIA, Google Brain, Ant Financial
Pseudocode | No | The paper does not include any pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide an explicit statement or link indicating that the source code for the described methodology is publicly available.
Open Datasets | Yes | For all the experiments on CIFAR-10 and CIFAR-100 in the paper, we use moderate data augmentation, following [14, 27]. For ImageNet-2012, we follow the same data augmentation in [30]. We train our network with the publicly available CASIA-WebFace dataset [60], and then test the learned model on LFW and MegaFace datasets.
Dataset Splits | No | The paper uses standard datasets (CIFAR-10, CIFAR-100, ImageNet, MNIST, CASIA-WebFace, LFW, and MegaFace). While these datasets typically have predefined splits, the paper does not explicitly state the training/validation/test percentages or counts it uses.
Hardware Specification | Yes | We would like to thank NVIDIA corporation for donating Titan Xp GPUs to support our research.
Software Dependencies | No | The paper mentions software components implicitly through the methods used (e.g., SGD, BN, ReLU) and references to other papers, but it does not specify exact version numbers for programming languages, libraries, or frameworks (e.g., 'Python 3.x', 'PyTorch 1.x').
Experiment Setup | Yes | For all the experiments on CIFAR-10 and CIFAR-100 in the paper, we use moderate data augmentation, following [14, 27]. For ImageNet-2012, we follow the same data augmentation in [30]. We train all the networks using SGD with momentum 0.9, and the network initialization follows [13]. All the networks use BN [20] and ReLU if not otherwise specified. Experimental details are given in each subsection and Appendix A.
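
The quoted training setup (moderate CIFAR augmentation following [14, 27], SGD with momentum 0.9, initialization following [13], and BN + ReLU throughout) can be summarized in a short configuration sketch. The snippet below is a minimal illustration assuming PyTorch and torchvision; the paper does not name a framework, and the learning rate, weight decay, and toy network are illustrative placeholders rather than values or architectures reported by the authors.

```python
# Minimal sketch of the quoted setup; PyTorch/torchvision are assumptions,
# and lr/weight_decay and the toy network are placeholders.
import torch
import torch.nn as nn
import torchvision.transforms as T

# "Moderate data augmentation" for CIFAR-10/100 in the style of [14, 27]:
# 4-pixel padding with random 32x32 crops plus horizontal flips.
cifar_train_transform = T.Compose([
    T.RandomCrop(32, padding=4),
    T.RandomHorizontalFlip(),
    T.ToTensor(),
])

def conv_bn_relu(in_ch, out_ch):
    """Conv block using BN [20] and ReLU, as stated in the setup."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1, bias=False),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )

# Toy stand-in network; the paper's actual architectures are described in its
# experiment subsections and Appendix A.
model = nn.Sequential(
    conv_bn_relu(3, 64),
    conv_bn_relu(64, 64),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(64, 10),
)

# "Network initialization follows [13]" is read here as Kaiming/He init
# (an assumption about the cited reference).
for m in model.modules():
    if isinstance(m, nn.Conv2d):
        nn.init.kaiming_normal_(m.weight, nonlinearity='relu')

# SGD with momentum 0.9, as quoted; lr and weight_decay are placeholders.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1,
                            momentum=0.9, weight_decay=5e-4)
```

Any MHE regularization term would be added on top of the standard classification loss; its exact form and weighting are described in the paper and are not reproduced in this sketch.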