Learning What and Where to Transfer

Authors: Yunhun Jang, Hankook Lee, Sung Ju Hwang, Jinwoo Shin

ICML 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We validate our meta-transfer approach against recent transfer learning methods on various datasets and network architectures, on which our automated scheme significantly outperforms the prior baselines that find what and where to transfer in a hand-crafted manner." Section 3 presents the experimental results under various settings.
Researcher Affiliation | Collaboration | (1) School of Electrical Engineering, KAIST, Korea; (2) OMNIOUS, Korea; (3) School of Computing, KAIST, Korea; (4) Graduate School of AI, KAIST, Korea; (5) AITRICS, Korea.
Pseudocode | Yes | "Algorithm 1: Learning of θ with meta-parameters φ"
Open Source Code | No | The paper does not provide an explicit statement or link to open-source code for the described methodology.
Open Datasets | Yes | "For 32×32 scale, we use the Tiny ImageNet dataset as a source task, and CIFAR-10, CIFAR-100 (Krizhevsky & Hinton, 2009), and STL-10 (Coates et al., 2011) datasets as target tasks. ... For 224×224 scale, the ImageNet (Deng et al., 2009) dataset is used as a source dataset, and Caltech-UCSD Birds 200 (Wah et al., 2011), MIT Indoor Scene Recognition (Quattoni & Torralba, 2009), Stanford 40 Actions (Yao et al., 2011), and Stanford Dogs (Khosla et al., 2011) datasets as target tasks."
Dataset Splits | No | The paper mentions using N training samples per class for CIFAR-10 but does not specify explicit training/validation/test splits, beyond implying that the remainder of each dataset is used for testing.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, processor types, memory amounts) used for running its experiments.
Software Dependencies | No | The paper does not list software dependencies with version numbers (e.g., specific libraries or solvers) needed to replicate the experiments.
Experiment Setup | Yes | "Our final loss L_total to train a target model then is given as: L_total(θ | x, y, φ) = L_org(θ | x, y) + β · L_wfm(θ | x, φ), where L_org is the original loss (e.g., cross-entropy) and β > 0 is a hyper-parameter. We choose T = 2 in our experiments." Algorithm 1 (Learning of θ with meta-parameters φ) begins: "Input: Dataset D_train = {(x_i, y_i)}, learning rate α. repeat: Sample a batch B ⊂ D_train with |B| = B." On the meta-networks: "For all experiments, we construct the meta-networks as 1-layer fully-connected networks for each pair (m, n) ∈ C, where C is the set of candidate pairs, or matching configuration (see Figure 3). It takes the globally average-pooled features of the m-th layer of the source network as an input, and outputs w^{m,n}_c and λ^{m,n}. As for the channel assignments w, we use the softmax activation to generate them while satisfying Σ_c w^{m,n}_c = 1, and for the transfer amount λ between layers, we commonly use ReLU6 (Krizhevsky & Hinton, 2010), max(0, min(6, x)), to ensure non-negativeness of λ and to prevent λ^{m,n} from becoming too large."
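
The quoted setup lends itself to a short sketch. The snippet below assumes PyTorch; names such as PairMetaNet and total_loss are illustrative choices, not the authors' released implementation, and the feature-matching term is a simplified stand-in for the paper's L_wfm (it assumes the target features have already been mapped to the source channel dimension). It shows how a 1-layer fully-connected meta-network could produce softmax channel assignments and a ReLU6 transfer amount from globally average-pooled source features, and how these could enter L_total = L_org + β · L_wfm.

```python
# Minimal sketch, assuming PyTorch. Class/function names and the simplified
# feature-matching term are illustrative assumptions, not the paper's code.
import torch
import torch.nn as nn
import torch.nn.functional as F


class PairMetaNet(nn.Module):
    """1-layer FC meta-network for one candidate layer pair (m, n) in C.

    Input : globally average-pooled features of the m-th source layer.
    Output: channel assignments w^{m,n} (softmax, summing to 1 over c) and
            a transfer amount lambda^{m,n} (ReLU6, i.e. clipped to [0, 6]).
    """

    def __init__(self, src_channels: int):
        super().__init__()
        self.w_head = nn.Linear(src_channels, src_channels)  # -> w^{m,n}_c
        self.lam_head = nn.Linear(src_channels, 1)            # -> lambda^{m,n}

    def forward(self, src_feat_m: torch.Tensor):
        # (B, C, H, W) -> (B, C): global average pooling of the source features.
        pooled = F.adaptive_avg_pool2d(src_feat_m, 1).flatten(1)
        w = F.softmax(self.w_head(pooled), dim=1)             # sum_c w_c = 1
        lam = F.relu6(self.lam_head(pooled)).squeeze(1)       # 0 <= lambda <= 6
        return w, lam


def total_loss(logits, y, src_feat_m, tgt_feat_n, meta_net, beta=0.5):
    """L_total = L_org + beta * L_wfm for a single pair (m, n).

    The weighted feature-matching term below (per-channel weighted MSE between
    source and target feature maps, scaled by lambda) is a simplified stand-in
    for the paper's L_wfm; beta = 0.5 is an arbitrary placeholder value.
    """
    w, lam = meta_net(src_feat_m)                             # (B, C), (B,)
    l_org = F.cross_entropy(logits, y)                        # original loss
    per_channel = ((tgt_feat_n - src_feat_m) ** 2).mean(dim=(2, 3))  # (B, C)
    l_wfm = (lam * (w * per_channel).sum(dim=1)).mean()
    return l_org + beta * l_wfm
```

The ReLU6 head keeps λ non-negative and bounded above, which matches the quoted rationale of preventing λ^{m,n} from becoming too large, while the softmax head enforces Σ_c w^{m,n}_c = 1 by construction.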