LEEP: A New Measure to Evaluate Transferability of Learned Representations

Authors: Cuong Nguyen, Tal Hassner, Matthias Seeger, Cedric Archambeau

ICML 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct extensive experiments to evaluate our LEEP measure in several scenarios. We show that the measure is useful for predicting the performance of two commonly used transfer learning algorithms, head classifier re-training (Donahue et al., 2014; Razavian et al., 2014) and model fine-tuning (Agrawal et al., 2014; Girshick et al., 2014), not only for large target data sets, but also for small or imbalanced target data sets that are difficult to use for re-training. We evaluate the ability of LEEP to predict the performance of transfer and meta-transfer learning algorithms, prior to applying these algorithms in practice. We further show that LEEP is useful even in the small or imbalanced data settings, where training on the target task could be hard. We compare LEEP with the state-of-the-art NCE transferability measure of Tran et al. (2019) and the H-score of Bao et al. (2019). Finally, we demonstrate the use of LEEP for source model selection. Our experiments are implemented in Gluon/MXNet (Chen et al., 2015; Guo et al., 2019).
Researcher Affiliation | Industry | Amazon Web Services; Facebook AI (work done before joining Facebook).
Pseudocode | No | The paper describes the steps of the LEEP measure verbally and mathematically but does not include structured pseudocode or algorithm blocks; a reconstructed sketch is given after this table.
Open Source Code | No | The paper does not provide an explicit statement or link regarding the public availability of the source code for the described methodology.
Open Datasets | Yes | ImageNet (Russakovsky et al., 2015) and ResNet20 (He et al., 2016), which is pre-trained on CIFAR10 (Krizhevsky, 2009). For each model, we construct 200 different target tasks from the CIFAR100 data set (Krizhevsky, 2009). We further add experiments where target data sets are constructed from the Fashion MNIST data set (Xiao et al., 2017).
Dataset Splits | No | The paper mentions training and test sets and specific numbers of examples for small data scenarios, but does not describe explicit train/validation/test split procedures or ratios.
Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used for running the experiments.
Software Dependencies | No | The paper states that experiments are implemented in Gluon/MXNet (Chen et al., 2015; Guo et al., 2019), but no version numbers are provided for reproducibility.
Experiment Setup | Yes | In all tests, we ran SGD for 100 epochs with learning rate 0.01 and batch size 10 since they were sufficient to obtain good transferred models.
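
As noted in the Pseudocode row, the paper describes the LEEP score only in prose and equations. For reference, here is a minimal NumPy sketch of that score as we read it from the paper's description; the function name, argument layout, and the assumption that target labels are zero-indexed integers are our own illustrative choices, not code from the authors.

```python
import numpy as np

def leep(source_probs, target_labels):
    """Sketch of the LEEP score as described by Nguyen et al. (2020).

    source_probs : (n, |Z|) array of the source model's softmax outputs
                   ("dummy label distributions") over the source label set Z
                   for the n target examples.
    target_labels: (n,) integer array of target labels in {0, ..., |Y|-1}.
    """
    n, num_source_labels = source_probs.shape
    num_target_labels = int(target_labels.max()) + 1

    # Empirical joint distribution P_hat(y, z) over target and source labels.
    joint = np.zeros((num_target_labels, num_source_labels))
    for y in range(num_target_labels):
        joint[y] = source_probs[target_labels == y].sum(axis=0) / n

    # Empirical conditional P_hat(y | z) = P_hat(y, z) / P_hat(z).
    conditional = joint / joint.sum(axis=0, keepdims=True)

    # LEEP = (1/n) * sum_i log( sum_z P_hat(y_i | z) * theta(x_i)_z )
    expected_emp_pred = (source_probs * conditional[target_labels]).sum(axis=1)
    return float(np.log(expected_emp_pred).mean())
```

In this reading, `source_probs` would be obtained by running the pre-trained source model's softmax head over the target training examples, so the score requires only a single forward pass over the target data and no re-training.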
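
The Experiment Setup row fixes the optimizer, number of epochs, learning rate, and batch size, but the paper does not publish a training script, so the following Gluon/MXNet loop is only a hedged sketch of that setting. The placeholder `net`, `features`, and `labels` stand in for the transferred model and a target training set, which are not specified here.

```python
from mxnet import autograd, gluon, nd

# Placeholders so the sketch is self-contained; in the paper's setting these
# would be the transferred source model and the target task's training data.
net = gluon.nn.Dense(10)
net.initialize()
features = nd.random.uniform(shape=(200, 64))
labels = nd.floor(nd.random.uniform(low=0, high=10, shape=(200,)))
train_dataset = gluon.data.ArrayDataset(features, labels)

# Reported setup: SGD for 100 epochs, learning rate 0.01, batch size 10.
loader = gluon.data.DataLoader(train_dataset, batch_size=10, shuffle=True)
trainer = gluon.Trainer(net.collect_params(), 'sgd', {'learning_rate': 0.01})
loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()

for epoch in range(100):
    for data, label in loader:
        with autograd.record():
            loss = loss_fn(net(data), label)
        loss.backward()
        trainer.step(data.shape[0])
```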