Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
One Meta-tuned Transformer is What You Need for Few-shot Learning
Authors: Xu Yang, Huaxiu Yao, Ying Wei
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | MetaFormer demonstrates coherence and compatibility with off-the-shelf pre-trained vision transformers and shows significant improvements in both inductive and transductive few-shot learning scenarios, outperforming state-of-the-art methods by up to 8.77% and 6.25% on 12 in-domain and 10 cross-domain datasets, respectively. |
| Researcher Affiliation | Academia | ¹City University of Hong Kong, ²University of North Carolina at Chapel Hill, ³Nanyang Technological University. Correspondence to: Xu Yang <EMAIL>, Ying Wei <EMAIL>. |
| Pseudocode | No | The paper does not include any explicitly labeled pseudocode or algorithm blocks. Figures depict architectures and processes but are not in a code-like format. |
| Open Source Code | No | The paper does not contain any explicit statements about releasing source code, such as 'Our code is available at...' or provide a link to a code repository. |
| Open Datasets | Yes | We train and evaluate our MetaFormer on the four standard few-shot benchmarks: miniImageNet (Vinyals et al., 2016b), tieredImageNet (Ren et al., 2018b), CIFAR-FS (Bertinetto et al., 2019) and FC-100 (Oreshkin et al., 2018). |
| Dataset Splits | Yes | In all experiments, we follow the standard data usage specifications same as Hiller et al. (2022), splitting data into the meta-training set, meta-validation set, and meta-test set, and classes in each set are mutually exclusive. [...] The classes are divided into 64, 16, and 20 for training, validation, and test, respectively. |
| Hardware Specification | Yes | The evaluation of inference latency is conducted on an NVIDIA RTX A6000 GPU. |
| Software Dependencies | No | The paper mentions using 'the SGD optimizer' but does not specify version numbers for programming languages, libraries (e.g., PyTorch, TensorFlow), or other software components necessary for reproduction. |
| Experiment Setup | Yes | We use the image resolution of $224 \times 224$ and the output is projected to 8192 dimensions. A patch size of 16 and window size of 7 are used... A batch size of 512 and a cosine-decaying learning rate schedule are used. [...] We employ the SGD optimizer, utilizing a cosine-decaying learning rate initiated at $2 \times 10^{-4}$, a momentum value of 0.9, and a weight decay of $5 \times 10^{-4}$ across all datasets. The input image size is set to $224 \times 224$ for MetaFormer and $360 \times 360$ for SMKD-MetaFormer. Typically, training is conducted for a maximum of 200 epochs. |
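
The optimizer settings quoted in the Experiment Setup row can be made concrete with a small sketch. The code below is illustrative only and is not the authors' implementation: the paper names no framework or versions, so it uses plain Python. The names `SGDSetup` and `cosine_lr` are hypothetical, and the schedule details (per-epoch updates, decay to a floor of zero, no warm-up) are assumptions not stated in the quoted text.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class SGDSetup:
    """Values quoted in the Experiment Setup row (hypothetical container)."""
    base_lr: float = 2e-4       # initial learning rate
    momentum: float = 0.9       # SGD momentum
    weight_decay: float = 5e-4  # weight decay, same across all datasets
    max_epochs: int = 200       # maximum number of training epochs


def cosine_lr(epoch: int, cfg: SGDSetup, min_lr: float = 0.0) -> float:
    """Cosine-decayed learning rate at the start of a 0-indexed epoch.

    Assumption: the rate decays from cfg.base_lr to min_lr over
    cfg.max_epochs, with no warm-up phase.
    """
    # Fraction of training completed, clamped to [0, 1].
    progress = min(max(epoch, 0), cfg.max_epochs) / cfg.max_epochs
    return min_lr + 0.5 * (cfg.base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    cfg = SGDSetup()
    for epoch in (0, 50, 100, 150, 200):
        print(f"epoch {epoch:3d}: lr = {cosine_lr(epoch, cfg):.3e}")
```

Under these assumptions the rate starts at the quoted initial value, reaches half of it at the midpoint of training, and approaches zero at the final epoch. A real reproduction would have to confirm the floor value and the step granularity from the paper or its authors.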