HyperTransformer: Model Generation for Supervised and Semi-Supervised Few-Shot Learning
Authors: Andrey Zhmoginov, Mark Sandler, Max Vladymyrov
ICML 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we present HYPERTRANSFORMER (HT) experimental results and discuss the implications of our empirical findings. ... Table 1. Comparison of HT with MAML++ and RFS on models of different sizes and different datasets: (a) 20-way OMNIGLOT, (b) 5-way MINIIMAGENET and (c) 5-way TIEREDIMAGENET. |
| Researcher Affiliation | Industry | Andrey Zhmoginov¹ Mark Sandler¹ Max Vladymyrov¹ ... ¹Google Research. Correspondence to: Andrey Zhmoginov <azhmogin@google.com>. |
| Pseudocode | No | The paper describes algorithmic ideas but does not contain any structured pseudocode or algorithm blocks clearly labeled as such. |
| Open Source Code | Yes | The code for the paper can be found at https://github.com/google-research/google-research/tree/master/hypertransformer. |
| Open Datasets | Yes | For our experiments, we chose several most widely used few-shot datasets including OMNIGLOT, MINIIMAGENET and TIEREDIMAGENET. |
| Dataset Splits | No | The paper describes how tasks are sampled for training and evaluation (e.g., 'each training task t ∈ T_train is sampled by first randomly choosing n distinct classes C_t from a large training dataset and then sampling examples without replacement from these classes to generate τ(t) and Q(t).'), but it does not provide specific percentages or counts for training, validation, and test splits of the overall datasets like OMNIGLOT, MINIIMAGENET, and TIEREDIMAGENET. (A sketch of this episode-sampling procedure follows the table.) |
| Hardware Specification | No | The paper does not provide any specific hardware details such as GPU/CPU models, processor types, or memory amounts used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details with version numbers (e.g., library or solver names with version numbers like Python 3.8, PyTorch 1.9). |
| Experiment Setup | Yes | In all our experiments, we used gradient descent optimizer with a learning rate in the 0.01 to 0.02 range. Our early experiments with more advanced optimizers were unstable. We used a learning rate decay schedule, in which we reduced the learning rate by a factor of 0.95 every 10^5 learning steps. ... For all tasks except 5-shot MINIIMAGENET our Transformer had 3 layers... The 5-shot MINIIMAGENET and TIEREDIMAGENET results presented in Table 1 were obtained with a simplified Transformer model that had 1 layer... (A sketch of this learning-rate schedule follows the table.) |
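
The Dataset Splits row quotes the paper's episode-sampling procedure. As a rough illustration of that procedure (not the authors' implementation), the following Python sketch samples an n-way task by choosing n distinct classes and then drawing support and query examples without replacement; the function and variable names are hypothetical.

```python
import random
from collections import defaultdict

def sample_episode(dataset, n_way, k_shot, q_queries, rng=random):
    """Sample one few-shot task: pick n distinct classes C_t, then draw
    support (tau(t)) and query (Q(t)) examples without replacement.
    `dataset` is assumed to be an iterable of (example, label) pairs."""
    by_class = defaultdict(list)
    for example, label in dataset:
        by_class[label].append(example)

    classes = rng.sample(sorted(by_class), n_way)  # n distinct classes C_t
    support, query = [], []
    for episode_label, cls in enumerate(classes):
        # Draw support + query examples for this class without replacement.
        drawn = rng.sample(by_class[cls], k_shot + q_queries)
        support += [(x, episode_label) for x in drawn[:k_shot]]   # tau(t)
        query += [(x, episode_label) for x in drawn[k_shot:]]     # Q(t)
    return support, query
```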
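The Experiment Setup row describes plain gradient descent with a base learning rate in the 0.01 to 0.02 range, reduced by a factor of 0.95 every 10^5 steps. Below is a minimal sketch of that schedule, assuming the decay is applied as a step (staircase) function; the excerpt does not say whether the decay is discrete or continuous.

```python
def learning_rate(step, base_lr=0.01, decay_rate=0.95, decay_steps=100_000):
    """Staircase learning-rate decay: multiply the base rate by 0.95
    after every 10^5 training steps (assumed interpretation)."""
    return base_lr * decay_rate ** (step // decay_steps)

# Example: learning_rate(0) == 0.01; learning_rate(250_000) == 0.01 * 0.95**2
```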