Manifold Mixup: Better Representations by Interpolating Hidden States

Authors: Vikas Verma, Alex Lamb, Christopher Beckham, Amir Najafi, Ioannis Mitliagkas, David Lopez-Paz, Yoshua Bengio

ICML 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Throughout a wide variety of experiments, we demonstrate four substantial benefits of Manifold Mixup: (1) better generalization than other competitive regularizers (such as Cutout, Mixup, AdaMix, and Dropout) (Section 5.1); (2) improved log-likelihood on test samples (Section 5.1); (3) increased performance at predicting data subject to novel deformations (Section 5.2); (4) improved robustness to single-step adversarial attacks, which is evidence that Manifold Mixup pushes the decision boundary away from the data in some directions (Section 5.3)."
Researcher Affiliation | Collaboration | 1 Aalto University, Finland; 2 Montréal Institute for Learning Algorithms (MILA); 3 Sharif University of Technology; 4 Facebook Research
Pseudocode | No | The paper describes the steps of Manifold Mixup in narrative form in Section 2, but does not include a formally labeled pseudocode or algorithm block (a hedged training-step sketch is given after this table).
Open Source Code | No | The paper does not provide any explicit statement about making source code available, nor a link to a code repository.
Open Datasets | Yes | "We show results for the CIFAR-10 (Table 1a), CIFAR-100 (Table 1b), SVHN (Table 2), and Tiny ImageNet (Table 3) datasets."
Dataset Splits | No | "For each regularizer, we selected the best hyper-parameters using a validation set." A validation set is mentioned, but no specific details about its size, split percentage, or how it was created are provided in the main text, so the split is not reproducible from the paper alone.
Hardware Specification | No | The acknowledgements thank "Compute Canada for providing computing resources used in this work", but no specific hardware details (GPU models, CPU types, or memory) are given for the experiments.
Software Dependencies | No | No specific software dependencies with version numbers (e.g., Python 3.8, PyTorch 1.9) are mentioned in the paper.
Experiment Setup | Yes | "We follow the training procedure of (Zhang et al., 2018), which is to use SGD with momentum, a weight decay of 10^-4, and a step-wise learning rate decay. Please refer to Appendix C for further details (including the values of the hyperparameter α)." (See the optimizer configuration sketch after this table.)
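
Since the paper presents Manifold Mixup only in narrative form (Section 2), the following is a minimal sketch of how such a training step is commonly implemented. The split of the model into a list of `blocks` plus a `classifier` head, the Beta-distribution default, and the within-batch shuffling trick are illustrative assumptions, not the authors' released code.

```python
import numpy as np
import torch
import torch.nn.functional as F

def manifold_mixup_loss(blocks, classifier, x, y, alpha=2.0):
    """Loss for one training step with hidden-state interpolation at a random layer."""
    lam = float(np.random.beta(alpha, alpha))            # mixing coefficient ~ Beta(alpha, alpha)
    k = int(np.random.randint(0, len(blocks) + 1))       # layer to mix at; k = 0 mixes the input
    index = torch.randperm(x.size(0), device=x.device)   # pair examples by shuffling the batch

    h = x
    if k == 0:                                           # input mixup as a special case
        h = lam * h + (1.0 - lam) * h[index]
    for i, block in enumerate(blocks, start=1):
        h = block(h)
        if i == k:                                       # interpolate hidden states at layer k
            h = lam * h + (1.0 - lam) * h[index]
    logits = classifier(h)

    # The same interpolation is applied to the targets via the mixed cross-entropy.
    return lam * F.cross_entropy(logits, y) + (1.0 - lam) * F.cross_entropy(logits, y[index])

# Illustrative usage with toy modules (shapes chosen arbitrarily):
blocks = [torch.nn.Sequential(torch.nn.Linear(32, 64), torch.nn.ReLU()),
          torch.nn.Sequential(torch.nn.Linear(64, 64), torch.nn.ReLU())]
classifier = torch.nn.Linear(64, 10)
x, y = torch.randn(16, 32), torch.randint(0, 10, (16,))
manifold_mixup_loss(blocks, classifier, x, y).backward()
```

Mixing the batch with a shuffled copy of itself (`h[index]`) is the standard single-pass way to form random example pairs without a second forward pass.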
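As a concrete reading of the quoted setup (SGD with momentum, weight decay of 10^-4, step-wise learning-rate decay), a minimal PyTorch configuration could look like the sketch below. The momentum value, initial learning rate, and decay milestones are placeholders, since the paper defers those values (including α) to its Appendix C.

```python
import torch

# Placeholder model standing in for the networks trained in the paper.
model = torch.nn.Linear(32, 10)

# SGD with momentum, weight decay 1e-4, and a step-wise learning-rate schedule.
# Momentum, initial learning rate, and milestones below are assumed values.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1,
                            momentum=0.9, weight_decay=1e-4)
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer,
                                                 milestones=[100, 150], gamma=0.1)

# Training loop outline: call optimizer.step() per mini-batch,
# then scheduler.step() once per epoch to apply the step-wise decay.
```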