Meta Dropout: Learning to Perturb Latent Features for Generalization

Authors: Hae Beom Lee, Taewook Nam, Eunho Yang, Sung Ju Hwang

ICLR 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We validate our method on few-shot classification datasets, whose results show that it significantly improves the generalization performance of the base model, and largely outperforms existing regularization methods such as information bottleneck, manifold mixup, and information dropout."
Researcher Affiliation | Collaboration | Hae Beom Lee¹, Taewook Nam¹, Eunho Yang¹,², Sung Ju Hwang¹,²; ¹KAIST, ²AITRICS, South Korea; {haebeom.lee,namsan,eunhoy,sjhwang82}@kaist.ac.kr
Pseudocode | Yes | Algorithm 1 (Meta-training) and Algorithm 2 (Meta-testing) in Appendix A.
Open Source Code | No | The paper states "We used TensorFlow (Abadi et al., 2016) for all our implementations." but does not provide a link or an explicit statement about the availability of its own source code.
Open Datasets | Yes | "We validate our method on the following two benchmark datasets for few-shot classification. 1) Omniglot: This gray-scale hand-written character dataset consists of 1623 classes with 20 examples of size 28×28 for each class. Following the experimental setup of Vinyals et al. (2016), we use 1200 classes for meta-training, and the remaining 423 classes for meta-testing. ... 2) miniImageNet: This is a subset of ILSVRC-2012 (Deng et al., 2009), consisting of 100 classes with 600 examples of size 84×84 per class."
Dataset Splits | Yes | Omniglot: "Following the experimental setup of Vinyals et al. (2016), we use 1200 classes for meta-training, and the remaining 423 classes for meta-testing." ... miniImageNet: "There are 64, 16 and 20 classes for meta-train/validation/test respectively." (See the split sketch after the table.)
Hardware Specification | No | The paper does not provide any specific hardware details such as GPU or CPU models used for the experiments.
Software Dependencies | No | The paper states "We used TensorFlow (Abadi et al., 2016) for all our implementations." but does not specify the TensorFlow version or any other software dependencies.
Experiment Setup | Yes | Omniglot: "For 1-shot classification, we use the meta-batch size of B = 8 and the inner-gradient stepsize of α = 0.1. For 5-shot classification, we use B = 6 and α = 0.4. We train for a total of 40K iterations with meta-learning rate 10⁻³." miniImageNet: "We use B = 4 and α = 0.01. We train for 60K iterations with meta-learning rate 10⁻⁴." Both datasets: "Each inner-optimization consists of 5 SGD steps for both meta-training and meta-testing. ... We use Adam optimizer (Kingma & Ba, 2014) with gradient clipping of [−3, 3]." (These hyperparameters are gathered into a config sketch after the table.)
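
The Open Datasets and Dataset Splits rows pin down the class-level partitions exactly, so they can be captured in a short configuration sketch. The snippet below is a minimal illustration only: the split sizes come from the quoted text, while the dictionary layout and the `split_classes` helper are assumptions, not anything released by the authors.

```python
# Minimal sketch of the class-level splits quoted in the table above.
# Split sizes come from the paper; the data structures and the
# split_classes helper are illustrative assumptions.

FEW_SHOT_SPLITS = {
    "omniglot": {          # 1623 classes, 20 examples each, 28x28 gray-scale
        "meta_train": 1200,
        "meta_test": 423,  # remaining classes (no validation split quoted)
    },
    "miniimagenet": {      # 100 classes, 600 examples each, 84x84
        "meta_train": 64,
        "meta_val": 16,
        "meta_test": 20,
    },
}


def split_classes(class_ids, sizes):
    """Partition a list of class ids according to the sizes above."""
    splits, start = {}, 0
    for name, count in sizes.items():
        splits[name] = class_ids[start:start + count]
        start += count
    assert start == len(class_ids), "split sizes must cover every class"
    return splits


if __name__ == "__main__":
    omniglot_splits = split_classes(list(range(1623)), FEW_SHOT_SPLITS["omniglot"])
    print({k: len(v) for k, v in omniglot_splits.items()})
    # -> {'meta_train': 1200, 'meta_test': 423}
```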
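Similarly, the Experiment Setup row lists every optimization hyperparameter the paper reports. The sketch below collects them in one place and shows element-wise value clipping to [−3, 3] as a plain function; the key names, the `clip_gradient` helper, and the overall structure are assumptions for illustration, not the authors' implementation (which is not publicly available).

```python
# Hyperparameters quoted in the Experiment Setup row, gathered into one
# configuration sketch. Only the values are taken from the paper; the key
# names and structure are assumptions.

META_DROPOUT_CONFIG = {
    "omniglot_1shot": {"meta_batch_size": 8, "inner_lr": 0.1,
                       "meta_lr": 1e-3, "iterations": 40_000},
    "omniglot_5shot": {"meta_batch_size": 6, "inner_lr": 0.4,
                       "meta_lr": 1e-3, "iterations": 40_000},
    "miniimagenet":   {"meta_batch_size": 4, "inner_lr": 0.01,
                       "meta_lr": 1e-4, "iterations": 60_000},
}

# Shared across both datasets (quoted above): 5 SGD steps per inner
# optimization, Adam as the meta-optimizer, gradients clipped to [-3, 3].
INNER_STEPS = 5
GRAD_CLIP_RANGE = (-3.0, 3.0)


def clip_gradient(g, low=GRAD_CLIP_RANGE[0], high=GRAD_CLIP_RANGE[1]):
    """Element-wise value clipping as reported in the paper (illustrative)."""
    return max(low, min(high, g))


if __name__ == "__main__":
    print(META_DROPOUT_CONFIG["omniglot_1shot"])
    print([clip_gradient(g) for g in (-5.2, 0.7, 4.1)])  # -> [-3.0, 0.7, 3.0]
```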