Gradient-Based Meta-Learning with Learned Layerwise Metric and Subspace
Authors: Yoonho Lee, Seungjin Choi
ICML 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We performed experiments to answer: Do our novel components (TW, M, etc.) improve meta-learning performance? (6.1) Is applying a mask M row-wise actually better than applying one parameter-wise? (6.1) To what degree does T alleviate the need for careful tuning of step size α? (6.2) In MT-nets, does learned subspace dimension reflect the difficulty of tasks? (6.3) Can T-nets and MT-nets scale to large-scale meta-learning problems? (6.4) |
| Researcher Affiliation | Academia | Department of Computer Science and Engineering, Pohang University of Science and Technology, Korea. |
| Pseudocode | Yes | Algorithm 1 Transformation Networks (T-net); Algorithm 2 Mask Transformation Networks (MT-net) |
| Open Source Code | No | The paper mentions 'Most of our experiments were performed by modifying the code accompanying (Finn et al., 2017)', but it does not provide a link or explicit statement about the availability of their own source code. |
| Open Datasets | Yes | To compare the performance of MT-nets to prior work in meta-learning, we evaluate our method on few-shot classification on the Omniglot (Lake et al., 2015) and Mini Imagenet (Ravi & Larochelle, 2017) datasets. |
| Dataset Splits | No | The paper describes training and testing examples per task ('Each task consists of K ∈ {5, 10, 20} training examples and 10 testing examples') but does not explicitly mention a distinct validation dataset split with specific percentages or counts. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU models, CPU models, memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper mentions using 'Adam (Kingma & Ba, 2015)' as a meta-optimizer, but it does not specify software components or libraries with version numbers. |
| Experiment Setup | Yes | We used Adam (Kingma & Ba, 2015) as our meta-optimizer with a learning rate of β = 10⁻³. Task-specific learners used step size α = 10⁻². We initialize all ζ to 0, all T as identity matrices, and all W as truncated normal matrices with standard deviation 10⁻². |
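
The experiment-setup row above fixes the meta-optimizer (Adam, β = 10⁻³), the task-specific step size (α = 10⁻²), and the initialization of ζ, T, and W, while the pseudocode row names the T-net and MT-net update procedures. The sketch below is a minimal, hedged illustration of how those choices could plug into a single T-net/MT-net-style layer and one inner-/outer-loop step; the toy regression task, layer sizes, and the soft (expected) row-wise mask are assumptions made for illustration and are not the authors' released code.

```python
# Minimal illustrative sketch (not the authors' implementation) of one
# T-net/MT-net-style layer and a single meta-training step, using the
# hyperparameters quoted in the "Experiment Setup" row above.
import torch

torch.manual_seed(0)

d_in, d_out = 4, 4
alpha, beta = 1e-2, 1e-3           # task-specific step size / meta learning rate

# Initialization as described in the paper's setup:
W = torch.nn.Parameter(torch.empty(d_out, d_in))
torch.nn.init.trunc_normal_(W, std=1e-2)         # W: truncated normal, std 10^-2
T = torch.nn.Parameter(torch.eye(d_out))         # T: identity matrix
zeta = torch.nn.Parameter(torch.zeros(d_out))    # mask logits zeta: zeros

meta_opt = torch.optim.Adam([W, T, zeta], lr=beta)  # Adam meta-optimizer

def layer(x, W_task):
    # T-net layer: effective weight is T @ W; only W is adapted per task.
    return torch.relu(x @ (T @ W_task).t())

def task_loss(W_task, x, y):
    return ((layer(x, W_task) - y) ** 2).mean()

# One meta-training step on a single toy task (support / query split).
x_s, y_s = torch.randn(5, d_in), torch.randn(5, d_out)
x_q, y_q = torch.randn(10, d_in), torch.randn(10, d_out)

# Inner loop: adapt W with one gradient step of size alpha. In an MT-net the
# rows of W are gated by a binary mask sampled from logits zeta; here a soft
# (expected) mask stands in for that sampling, purely for brevity.
grad_W = torch.autograd.grad(task_loss(W, x_s, y_s), W, create_graph=True)[0]
mask = torch.sigmoid(zeta).unsqueeze(1)          # row-wise mask, soft relaxation
W_adapted = W - alpha * mask * grad_W

# Outer loop: update W, T, and zeta on the query set, differentiating
# through the inner adaptation step.
meta_opt.zero_grad()
task_loss(W_adapted, x_q, y_q).backward()
meta_opt.step()
```

The design point the sketch tries to convey matches the paper's description: the inner loop only moves W (optionally through a learned mask), while T and ζ are meta-learned in the outer loop together with the initialization of W.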