A Theoretical Analysis of the Number of Shots in Few-Shot Learning
Authors: Tianshi Cao, Marc T. Law, Sanja Fidler
ICLR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We introduce a theoretical analysis of the impact of the shot number on Prototypical Networks, a state-of-the-art few-shot classification method. From our analysis, we propose a simple method that is robust to the choice of shot number used during meta-training, which is a crucial hyperparameter. Our model, trained with an arbitrary meta-training shot number, performs well across different values of the meta-testing shot number. We experimentally demonstrate our approach on different few-shot classification benchmarks. |
| Researcher Affiliation | Collaboration | Tianshi Cao (1, 2), Marc T. Law (1, 2, 3), Sanja Fidler (1, 2, 3); (1) Department of Computer Science, University of Toronto; (2) Vector Institute; (3) NVIDIA |
| Pseudocode | Yes | Appendix A.1 ("Algorithm for EST") presents Algorithm 1, "Algorithm for computing the transformation T"; a hedged sketch of such a transformation follows this table. |
| Open Source Code | No | The paper does not provide an explicit statement about releasing source code or a link to a code repository for the described methodology. |
| Open Datasets | Yes | Experiments are performed on three datasets: Omniglot (Lake et al., 2015), miniImageNet (Vinyals et al., 2016), and tieredImageNet (Ren et al., 2018). |
| Dataset Splits | Yes | For miniImageNet experiments, we use the splits proposed by Ravi & Larochelle (2017), where 64 classes are used for training, 16 for validation, and 20 for testing. tieredImageNet... In total, there are 351 classes in training, 97 in validation, and 160 in testing. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU models, CPU types) used for running the experiments. It only describes the network architectures. |
| Software Dependencies | No | The paper mentions the use of the Adam optimizer, but does not provide version numbers for any software components (e.g., Python, PyTorch, TensorFlow, specific libraries). |
| Experiment Setup | Yes | The Adam (Kingma & Ba, 2014) optimizer is used with β₁ = 0.9, β₂ = 0.999, ε = 10⁻⁸, and an initial learning rate of 0.001 that is decayed by half every 2000 episodes. On Omniglot, training uses 60 classes and 5 query points per episode; on miniImageNet and tieredImageNet, 20 classes and 15 query points per episode. A hedged training-loop sketch follows this table. |
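The Pseudocode row above refers to Algorithm 1, which computes the transformation T used by the paper's EST method. The paper's algorithm is not reproduced here; as a rough illustration only, below is a minimal NumPy sketch of one plausible LDA-style construction of such a transformation (whiten the within-class covariance, then project onto the leading between-class directions). The function name, its signature, and the exact construction are assumptions for illustration, not the paper's Algorithm 1.

```python
import numpy as np

def compute_transformation(embeddings, labels, d, ridge=1e-6):
    """Hypothetical EST-style transformation T (LDA-like sketch, NOT the paper's Algorithm 1).

    embeddings: (n, p) array of meta-training features (>= 2 samples per class assumed)
    labels:     (n,)   integer class labels
    d:          target dimensionality of the transformed space
    """
    classes = np.unique(labels)
    p = embeddings.shape[1]
    mu = embeddings.mean(axis=0)
    sw = np.zeros((p, p))  # average within-class covariance
    sb = np.zeros((p, p))  # between-class covariance of class means
    for c in classes:
        x = embeddings[labels == c]
        sw += np.cov(x, rowvar=False)
        diff = (x.mean(axis=0) - mu)[:, None]
        sb += diff @ diff.T
    sw /= len(classes)
    sb /= len(classes)
    # Whitening matrix Sigma_W^{-1/2}, with a small ridge for numerical stability.
    evals, evecs = np.linalg.eigh(sw + ridge * np.eye(p))
    w = evecs @ np.diag(evals ** -0.5) @ evecs.T
    # Keep the top-d eigenvectors of the whitened between-class covariance
    # (eigh returns eigenvalues in ascending order, so the last d columns are the top d).
    _, v = np.linalg.eigh(w @ sb @ w.T)
    return v[:, -d:].T @ w  # T has shape (d, p)
```

Under this construction, an embedding z would be mapped to T @ z before prototypes and distances are computed.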
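The Experiment Setup row translates directly into standard training code. Below is a minimal PyTorch sketch of that configuration, assuming the standard Adam parameters β₁ = 0.9, β₂ = 0.999, ε = 10⁻⁸ quoted above; the encoder architecture, the total number of episodes, and the loss computation are placeholders, not details from the paper.

```python
import torch

# Placeholder encoder; the paper's convolutional backbone is not reproduced here.
encoder = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 84 * 84, 64))

# Adam with the hyperparameters quoted above and initial learning rate 0.001.
optimizer = torch.optim.Adam(encoder.parameters(), lr=1e-3,
                             betas=(0.9, 0.999), eps=1e-8)
# Halve the learning rate every 2000 episodes, as stated in the table.
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=2000, gamma=0.5)

num_episodes = 40000  # assumption; the total episode count is not quoted above
for episode in range(num_episodes):
    optimizer.zero_grad()
    # Sample an episode (e.g., 20 classes with 15 query points each on
    # mini/tieredImageNet), compute the prototypical-network loss, then:
    #   loss.backward()
    optimizer.step()
    scheduler.step()
```

`StepLR` with `step_size=2000` and `gamma=0.5` implements the quoted schedule of decaying the learning rate by half every 2000 episodes.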