LGM-Net: Learning to Generate Matching Networks for Few-Shot Learning

Authors: Huaiyu Li, Weiming Dong, Xing Mei, Chongyang Ma, Feiyue Huang, Bao-Gang Hu

ICML 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The experimental results on the Omniglot and miniImageNet datasets demonstrate that LGM-Net can effectively adapt to similar unseen tasks and achieve competitive performance. The results on synthetic datasets show that transferable prior knowledge is learned by the MetaNet module, which maps training data to functional weights.
Researcher Affiliation | Collaboration | (1) National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China; (2) University of Chinese Academy of Sciences, Beijing 100049, China; (3) Snap Inc.; (4) Kwai Inc.; (5) Youtu Lab, Tencent.
Pseudocode | Yes | Algorithm 1: The training algorithm of LGM-Net for N-way K-shot problems. (A generic sketch of one such episodic training step follows the table.)
Open Source Code | Yes | Our source code is available online at https://github.com/likesiwell/LGM-Net/
Open Datasets | Yes | The Omniglot dataset consists of 1623 characters (classes) with 20 samples for each class from 50 different alphabets. Following (Vinyals et al., 2016; Snell et al., 2017), we randomly select 1200 classes as the meta training dataset and use the remaining 423 classes as the meta test dataset. The miniImageNet dataset, originally proposed by (Vinyals et al., 2016), consists of 60,000 images from 100 selected ImageNet classes, each having 600 examples.
Dataset Splits | Yes | Formally, we have three datasets, i.e., a meta training dataset Dmeta-train, a meta validation dataset Dmeta-val, and a meta test dataset Dmeta-test. We use the meta training dataset to train our model and the meta validation dataset for model selection. We follow the split introduced by (Ravi & Larochelle, 2017), with 64, 16, and 20 classes for training, validation, and test, respectively. (An illustrative N-way K-shot episode sampler over such a split appears after the table.)
Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, processor types, or memory amounts) used for its experiments; it only mentions implementing the model in TensorFlow.
Software Dependencies | No | The paper mentions TensorFlow but does not specify its version or list other software dependencies with version numbers.
Experiment Setup | Yes | Adam optimization (Kingma & Ba, 2015) is applied for training, with an initial learning rate of 10^-3 which is reduced by 10% every 1500 batches. The models are trained end-to-end from scratch without any additional dataset. (A sketch of this optimizer schedule follows below.)
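
The paper's Algorithm 1 is not reproduced in this summary. As a point of reference, below is a minimal sketch of one episodic N-way K-shot meta-training step in TensorFlow, the framework the paper reports using. The names `model` and `meta_train_step` and the input signature are illustrative assumptions; in LGM-Net the conditioning on the support set is performed by the MetaNet module, which generates the matching network's functional weights, a detail abstracted away here.

```python
import tensorflow as tf

def meta_train_step(model, optimizer, support_x, support_y, query_x, query_y):
    """One episodic update: condition on the support set, optimize the query loss.

    `model` is a hypothetical tf.keras.Model mapping (support_x, support_y,
    query_x) to query logits; this is a sketch, not the released LGM-Net code.
    """
    with tf.GradientTape() as tape:
        logits = model((support_x, support_y, query_x), training=True)
        loss = tf.reduce_mean(
            tf.keras.losses.sparse_categorical_crossentropy(
                query_y, logits, from_logits=True))
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss
```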
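
To make the episodic data pipeline concrete, here is a minimal N-way K-shot episode sampler over a class-indexed dataset such as the 64/16/20-class miniImageNet split or the 1200/423-class Omniglot split. The dictionary layout, function name, and default parameters are assumptions for illustration, not the authors' data loader.

```python
import numpy as np

def sample_episode(class_to_images, n_way=5, k_shot=1, n_query=15, rng=None):
    """Draw one N-way K-shot episode from a {class_name: list_of_images} dict.

    Returns (support, query) lists of (image, episode_label) pairs.
    Illustrative sketch only; not the released LGM-Net data pipeline.
    """
    rng = rng or np.random.default_rng()
    # Pick N classes for this episode and relabel them 0..N-1.
    classes = rng.choice(list(class_to_images), size=n_way, replace=False)
    support, query = [], []
    for episode_label, cls in enumerate(classes):
        idx = rng.permutation(len(class_to_images[cls]))[:k_shot + n_query]
        picked = [class_to_images[cls][i] for i in idx]
        support += [(img, episode_label) for img in picked[:k_shot]]
        query += [(img, episode_label) for img in picked[k_shot:]]
    return support, query
```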
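
The reported optimization setup (Adam, initial learning rate 10^-3, reduced by 10% every 1500 batches) maps naturally onto a step-wise exponential decay schedule. The sketch below is one TensorFlow realization of that description, assuming the Keras optimizer API; it is not claimed to match the exact released code.

```python
import tensorflow as tf

# Initial learning rate 1e-3, multiplied by 0.9 (a 10% reduction)
# every 1500 batches via a staircase exponential decay schedule.
lr_schedule = tf.keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate=1e-3,
    decay_steps=1500,
    decay_rate=0.9,
    staircase=True,
)
optimizer = tf.keras.optimizers.Adam(learning_rate=lr_schedule)
```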