Towards Understanding the Data Dependency of Mixup-style Training
Authors: Muthu Chidambaram, Xiang Wang, Yuzheng Hu, Chenwei Wu, Rong Ge
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To verify that the theory predicts the experiments, we train a two-layer feedforward neural network with 512 hidden units and ReLU activations on X_{2/3} with and without Mixup. (An illustrative sketch of this setup appears below the table.) |
| Researcher Affiliation | Academia | Muthu Chidambaram (1), Xiang Wang (1), Yuzheng Hu (2), Chenwei Wu (1), and Rong Ge (1); (1) Duke University, (2) University of Illinois at Urbana-Champaign |
| Pseudocode | No | The paper describes mathematical derivations and experimental procedures in prose, but it does not include any clearly labeled pseudocode blocks or algorithms. |
| Open Source Code | Yes | All of the code used to generate the plots and experimental results in this paper can be found at: https://github.com/2014mchidamb/Mixup-Data-Dependency. |
| Open Datasets | Yes | We validate this by training ResNet-18 (He et al., 2015) (using the popular implementation of Kuang Liu) on MNIST (LeCun, 1998), CIFAR-10, and CIFAR-100 (Krizhevsky, 2009) with and without Mixup (...) we consider the two moons dataset (Buitinck et al., 2013). |
| Dataset Splits | No | We validate this by training ResNet-18 (...) on MNIST (...), CIFAR-10, and CIFAR-100 (...) for 50 epochs with a batch size of 128 (...). The paper specifies the training parameters and datasets, but it does not give percentages or counts for training/validation/test splits, nor does it explicitly mention a distinct validation set. |
| Hardware Specification | No | The paper trains neural networks and conducts experiments but does not specify any hardware details such as specific GPU/CPU models, memory configurations, or cloud computing instance types used. |
| Software Dependencies | No | Our implementation uses PyTorch (Paszke et al., 2019) and is based heavily on the open source implementation of Manifold Mixup (Verma et al., 2019) by Shivam Saboo. (...) training using (full-batch) Adam (Kingma & Ba, 2015). The paper cites PyTorch and Adam but does not state version numbers for these software dependencies. |
| Experiment Setup | Yes | Results for training using (full-batch) Adam (Kingma & Ba, 2015) with the suggested (and common) hyperparameters of β1 = 0.9, β2 = 0.999 and a learning rate of 0.001 are shown in Figure 1. (...) We validate this by training ResNet-18 (...) for 50 epochs with a batch size of 128 and otherwise identical settings to the previous subsection. (See the ResNet-18 sketch below the table.) |
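
The synthetic experiment quoted in the Research Type row can be made concrete with a short sketch. This is a minimal illustration, not the paper's released code: the 2-D placeholder data stands in for the paper's dataset X_{2/3}, and the Beta(1, 1) mixing distribution is an assumption; only the 512-unit ReLU architecture, the full-batch Adam optimizer, and its hyperparameters (learning rate 0.001, β1 = 0.9, β2 = 0.999) come from the quotes above.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Minimal sketch of Mixup training for a two-layer ReLU network with
# 512 hidden units. The 2-D inputs and 2 classes below are placeholders;
# the paper's actual synthetic dataset (X_{2/3} in the quote) is defined there.

def mixup_batch(x, y_onehot, alpha=1.0):
    """Form convex combinations of inputs and one-hot labels (standard Mixup)."""
    lam = torch.distributions.Beta(alpha, alpha).sample()
    perm = torch.randperm(x.size(0))
    x_mix = lam * x + (1 - lam) * x[perm]
    y_mix = lam * y_onehot + (1 - lam) * y_onehot[perm]
    return x_mix, y_mix

# placeholder data (assumption, not the paper's dataset)
x = torch.randn(200, 2)
y = torch.randint(0, 2, (200,))
y_onehot = F.one_hot(y, num_classes=2).float()

model = nn.Sequential(nn.Linear(2, 512), nn.ReLU(), nn.Linear(512, 2))
# full-batch Adam with the hyperparameters quoted in the Experiment Setup row
opt = torch.optim.Adam(model.parameters(), lr=1e-3, betas=(0.9, 0.999))

for step in range(1000):
    x_mix, y_mix = mixup_batch(x, y_onehot)
    logits = model(x_mix)
    # cross-entropy against the soft (mixed) labels
    loss = -(y_mix * F.log_softmax(logits, dim=1)).sum(dim=1).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```

Training the same model on the unmixed (x, y_onehot) batch gives the "without Mixup" baseline referred to in the quote.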
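The ResNet-18 runs quoted in the Open Datasets and Experiment Setup rows follow the standard Mixup recipe. The sketch below uses torchvision's ResNet-18 and CIFAR-10 loader as stand-ins (the paper uses Kuang Liu's implementation), and the Beta(1, 1) mixing parameter and plain ToTensor transform are assumptions; the batch size of 128, 50 epochs, and Adam hyperparameters are taken from the quoted setup.

```python
import torch
import torch.nn.functional as F
from torch.utils.data import DataLoader
from torchvision import datasets, models, transforms

# Sketch of the image-classification setup: ResNet-18 on CIFAR-10 with Mixup,
# 50 epochs, batch size 128, Adam (lr = 0.001, betas = (0.9, 0.999)).
device = "cuda" if torch.cuda.is_available() else "cpu"
train_set = datasets.CIFAR10(root="./data", train=True, download=True,
                             transform=transforms.ToTensor())
loader = DataLoader(train_set, batch_size=128, shuffle=True)

model = models.resnet18(num_classes=10).to(device)
opt = torch.optim.Adam(model.parameters(), lr=1e-3, betas=(0.9, 0.999))

for epoch in range(50):
    for x, y in loader:
        x, y = x.to(device), y.to(device)
        # mix inputs within the batch; Beta(1, 1) is an assumed mixing distribution
        lam = torch.distributions.Beta(1.0, 1.0).sample().item()
        perm = torch.randperm(x.size(0), device=device)
        x = lam * x + (1 - lam) * x[perm]
        logits = model(x)
        # equivalent to cross-entropy against the mixed one-hot labels
        loss = lam * F.cross_entropy(logits, y) + (1 - lam) * F.cross_entropy(logits, y[perm])
        opt.zero_grad()
        loss.backward()
        opt.step()
```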