Gradient-Based Meta-Learning with Learned Layerwise Metric and Subspace

Authors: Yoonho Lee, Seungjin Choi

ICML 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We performed experiments to answer: Do our novel components (TW, M, etc.) improve meta-learning performance? (6.1) Is applying a mask M row-wise actually better than applying one parameter-wise? (6.1) To what degree does T alleviate the need for careful tuning of step size α? (6.2) In MT-nets, does learned subspace dimension reflect the difficulty of tasks? (6.3) Can T-nets and MT-nets scale to large-scale meta-learning problems? (6.4)
Researcher Affiliation | Academia | Department of Computer Science and Engineering, Pohang University of Science and Technology, Korea.
Pseudocode | Yes | Algorithm 1 Transformation Networks (T-net); Algorithm 2 Mask Transformation Networks (MT-net). A minimal sketch of these layer updates appears after the table.
Open Source Code | No | The paper mentions 'Most of our experiments were performed by modifying the code accompanying (Finn et al., 2017)', but it does not provide a link or explicit statement about the availability of their own source code.
Open Datasets | Yes | To compare the performance of MT-nets to prior work in meta-learning, we evaluate our method on few-shot classification on the Omniglot (Lake et al., 2015) and Mini Imagenet (Ravi & Larochelle, 2017) datasets.
Dataset Splits | No | The paper describes training and testing examples per task ('Each task consists of K ∈ {5, 10, 20} training examples and 10 testing examples') but does not explicitly mention a distinct validation dataset split with specific percentages or counts.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU models, CPU models, memory amounts) used for running its experiments.
Software Dependencies | No | The paper mentions using 'Adam (Kingma & Ba, 2015)' as a meta-optimizer, but it does not specify software components or libraries with version numbers.
Experiment Setup | Yes | We used Adam (Kingma & Ba, 2015) as our meta-optimizer with a learning rate of β = 10^-3. Task-specific learners used step size α = 10^-2. We initialize all ζ to 0, all T as identity matrices, and all W as truncated normal matrices with standard deviation 10^-2. A hedged sketch of this setup follows the table.
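
For reference, here is a minimal numpy sketch of a single T-net/MT-net layer and one masked inner-loop step on a toy squared loss. The shapes, helper names (forward, sample_mask, inner_step), the loss, and the sigmoid parameterization of the mask probability are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out = 4, 3

W = rng.standard_normal((d_out, d_in)) * 1e-2  # task-specific weights
T = np.eye(d_out)                               # meta-learned transformation (identity at init)
zeta = np.zeros(d_out)                          # mask logits, one per row of W

def forward(x, W, T):
    # T-net layer: the meta-learned T transforms the task-specific activation W x.
    return T @ (W @ x)

def sample_mask(zeta, rng):
    # Row-wise binary mask: each row of W is either fully updated or frozen.
    probs = 1.0 / (1.0 + np.exp(-zeta))          # assumed sigmoid parameterization
    return (rng.random(zeta.shape) < probs).astype(float)[:, None]

def inner_step(W, x, y, T, M, alpha=1e-2):
    # One task-specific gradient step on 0.5 * ||T W x - y||^2; only masked rows move.
    err = forward(x, W, T) - y
    grad_W = T.T @ np.outer(err, x)
    return W - alpha * (M * grad_W)

x, y = rng.standard_normal(d_in), rng.standard_normal(d_out)
M = sample_mask(zeta, rng)
W_task = inner_step(W, x, y, T, M)  # W after one inner-loop step on this task
```

In the MT-net, ζ is meta-learned alongside T and W, so the distribution of the sampled mask M determines, per layer, the subspace in which task-specific updates take place.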
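
The Experiment Setup row can also be read as the configuration sketch below, assuming the reported values (Adam with β = 10^-3, inner step size α = 10^-2, W ~ truncated normal with std 10^-2, T = I, ζ = 0). The truncation rule, function names, and layer sizes are assumptions; the paper's actual code builds on the TensorFlow release accompanying Finn et al. (2017) and is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

META_LR_BETA = 1e-3    # Adam meta-optimizer learning rate (beta)
INNER_LR_ALPHA = 1e-2  # task-specific learner step size (alpha)

def truncated_normal(shape, std=1e-2, rng=rng):
    # Resample entries falling outside two standard deviations
    # (a common truncation convention; the paper does not state one).
    samples = rng.standard_normal(shape) * std
    bad = np.abs(samples) > 2 * std
    while bad.any():
        samples[bad] = rng.standard_normal(bad.sum()) * std
        bad = np.abs(samples) > 2 * std
    return samples

def init_layer(d_in, d_out):
    # Reported initialization: W ~ truncated normal (std 1e-2), T = I, zeta = 0.
    return {
        "W": truncated_normal((d_out, d_in)),
        "T": np.eye(d_out),
        "zeta": np.zeros(d_out),
    }

params = init_layer(d_in=4, d_out=3)
```

Under the sigmoid parameterization assumed in the earlier sketch, ζ = 0 gives each row of W an even chance of being selected for task-specific updates at the start of meta-training, and T = I means each layer initially acts as a plain linear map.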