Reweighting Augmented Samples by Minimizing the Maximal Expected Loss

Authors: Mingyang Yi, Lu Hou, Lifeng Shang, Xin Jiang, Qun Liu, Zhi-Ming Ma

ICLR 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments are conducted on both natural language understanding tasks with token-level data augmentation, and image classification tasks with commonly-used image augmentation techniques like random crop and horizontal flip. Empirical results show that the proposed method improves the generalization performance of the model.
Researcher Affiliation | Collaboration | Mingyang Yi 1,2, Lu Hou 3, Lifeng Shang 3, Xin Jiang 3, Qun Liu 3, Zhi-Ming Ma 1,2. 1 University of Chinese Academy of Sciences (yimingyang17@mails.ucas.edu.cn); 2 Academy of Mathematics and Systems Science, Chinese Academy of Sciences (mazm@amt.ac.cn); 3 Huawei Noah's Ark Lab ({houlu3,shang.lifeng,Jiang.Xin,qun.liu}@huawei.com).
Pseudocode | Yes | Algorithm 1: Minimize the Maximal Expected Loss (MMEL). Algorithm 2: Augmented Sample Generation by Greedy Search.
Open Source Code | No | The paper does not contain any statement about making the source code publicly available, nor does it provide a link to a code repository.
Open Datasets | Yes | Data. CIFAR (Krizhevsky et al., 2014) is a benchmark dataset for image classification. We use both CIFAR-10 and CIFAR-100 in our experiments... Data. ImageNet (Deng et al., 2009)... Data. GLUE is a benchmark containing various natural language understanding tasks (Wang et al., 2019).
Dataset Splits | Yes | We use both CIFAR-10 and CIFAR-100 in our experiments; both consist of color images, with 50,000 training samples and 10,000 validation samples, from 10 and 100 object classes, respectively.
Hardware Specification | Yes | The time is the training time measured on a single NVIDIA V100 GPU.
Software Dependencies | No | The paper mentions using the AdamW optimizer and the BERT model, but it does not specify version numbers for these or for any other software components or libraries.
Experiment Setup | Yes | Setup. The model we used is ResNet (He et al., 2016) with different depths... We use SGD with momentum to train each model for 200 epochs. The learning rate starts from 0.1 and decays by a factor of 0.2 at epochs 60, 120, and 160. The batch size is 128, and the weight decay is 5e-4. For each x_i, |B(x_i)| = 10. The KL regularization coefficient λ_P is 1.0 for both MMEL-H and MMEL-S. The λ_T in equation (8) for MMEL-S is selected from {0.5, 1.0, 2.0}. Table 7: Hyperparameters of the BERT-base model.
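
Based on the Pseudocode and Experiment Setup entries above, the following is a minimal PyTorch sketch of how the per-example reweighting in Algorithm 1 (MMEL) could look. It is a reading of the quoted setup, not the paper's exact update rule: the closed-form softmax weighting of each augmented copy's loss within its group B(x_i), and the names `mmel_weighted_loss`, `kl_coeff`, and `n_aug`, are assumptions made for illustration.

```python
# Hedged sketch of the per-example reweighting step as read from Algorithm 1 (MMEL).
# Assumption: the maximal expected loss under a KL penalty toward a uniform prior has a
# softmax closed form, so each augmented copy in B(x_i) is weighted by
# softmax(loss / kl_coeff) within its group. All names here are illustrative.
import torch
import torch.nn.functional as F

def mmel_weighted_loss(model, x_aug, y, kl_coeff=1.0):
    """x_aug: (batch, n_aug, C, H, W) augmented copies B(x_i); y: (batch,) labels."""
    b, n_aug = x_aug.shape[:2]
    logits = model(x_aug.flatten(0, 1))                       # (batch * n_aug, n_classes)
    losses = F.cross_entropy(
        logits, y.repeat_interleave(n_aug), reduction="none"
    ).view(b, n_aug)                                          # per-copy loss L(x_ij)
    # Weights that (approximately) maximize the expected loss under the KL constraint;
    # detached so gradients flow only through the weighted loss itself.
    weights = F.softmax(losses.detach() / kl_coeff, dim=1)    # (batch, n_aug)
    return (weights * losses).sum(dim=1).mean()
```

The Experiment Setup entry also fixes most of the CIFAR training configuration. A hedged sketch of that configuration in PyTorch/torchvision follows; the momentum value (0.9), the crop padding (4), and the ResNet-18 stand-in are assumptions not stated in the quoted text, while the epoch count, learning-rate schedule, batch size, and weight decay follow the entry above.

```python
# Hedged sketch of the quoted CIFAR-10 training configuration (values not quoted above
# are marked as assumptions in the comments).
import torch
import torchvision
import torchvision.transforms as T

train_tf = T.Compose([
    T.RandomCrop(32, padding=4),   # random crop; padding=4 is a common default (assumed)
    T.RandomHorizontalFlip(),      # horizontal flip
    T.ToTensor(),
])
train_set = torchvision.datasets.CIFAR10("data", train=True, download=True, transform=train_tf)
train_loader = torch.utils.data.DataLoader(train_set, batch_size=128, shuffle=True)

model = torchvision.models.resnet18(num_classes=10)  # stand-in; the paper uses ResNets of different depths
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9, weight_decay=5e-4)
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[60, 120, 160], gamma=0.2)
# Train for 200 epochs, calling scheduler.step() once per epoch.
```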