Context-Aware Meta-Learning

Authors: Christopher Fifty, Dennis Duan, Ronald Guenther Junkins, Ehsan Amid, Jure Leskovec, Christopher Ré, Sebastian Thrun

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | On 8 out of 11 few-shot image classification benchmarks, our approach without meta-training or fine-tuning exceeds or matches the state-of-the-art algorithm, P>M>F, which is meta-trained on these benchmarks.
Researcher Affiliation | Collaboration | Stanford University; Google; Google DeepMind. Contact: fifty@cs.stanford.edu
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | Yes | Our code is available at https://github.com/cfifty/CAML.
Open Datasets | Yes | We pre-train CAML's non-causal sequence model on few-shot image classification tasks from ImageNet-1k (Deng et al., 2009), Fungi (Schroeder & Cui, 2018), MSCOCO (Lin et al., 2014), and WikiArt (Saleh & Elgammal, 2015).
Dataset Splits | No | The paper mentions using early stopping during pre-training, which implies a validation process, and refers to "train/validation splits of meta-learning benchmarks" for other methods. However, it does not explicitly provide the percentages or sample counts for the training/validation/test splits of any dataset used in its own experiments.
Hardware Specification | No | The paper mentions model architectures such as ViT-base and ViT-Large, but does not specify any hardware details such as GPU models, CPU types, or cloud computing resources used for the experiments.
Software Dependencies | No | The paper mentions using Hugging Face models and general optimization settings, but does not provide version numbers for any software dependencies such as programming languages, libraries, or frameworks (e.g., Python, PyTorch, CUDA).
Experiment Setup | Yes | When pre-training all models in the universal setting, we set the learning rate to a fixed 1 × 10⁻⁵ and do not perform any hyperparameter tuning in order to match the practices used by P>M>F. We use early stopping with a window size of 10 epochs during pre-training and the code release of Hu et al. (2022) to benchmark P>M>F with the training settings and hyperparameters described in their work. We select a batch size of 525 so that the 5-way-1-shot episodes contain 520 query predictions and the 5-way-5-shot episodes contain 500 query predictions.
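The Experiment Setup row pins down a few concrete numbers: a fixed learning rate, an early-stopping window, and a batch size that determines how many query predictions each episode contains. The sketch below restates those settings as a small Python configuration object. The class and method names are hypothetical and are not taken from the paper's code release; only the numeric values come from the quoted text.

```python
# Minimal sketch of the universal pre-training settings quoted in the
# "Experiment Setup" row. Class and method names are hypothetical; only the
# numeric values (learning rate, early-stopping window, batch size, episode
# shapes) are taken from the paper's text.

from dataclasses import dataclass


@dataclass
class UniversalPretrainConfig:
    learning_rate: float = 1e-5      # fixed, no hyperparameter tuning (matching P>M>F practice)
    early_stopping_window: int = 10  # epochs of patience during pre-training
    batch_size: int = 525            # total images per episode (support + query)

    def queries_per_episode(self, way: int, shot: int) -> int:
        """Query predictions left after removing the way * shot support images."""
        return self.batch_size - way * shot


if __name__ == "__main__":
    cfg = UniversalPretrainConfig()
    print(cfg.queries_per_episode(way=5, shot=1))  # 520, as stated for 5-way-1-shot
    print(cfg.queries_per_episode(way=5, shot=5))  # 500, as stated for 5-way-5-shot
```

This also makes the quoted arithmetic explicit: a batch of 525 minus a 5-way-1-shot support set of 5 images leaves 520 query predictions, and minus a 5-way-5-shot support set of 25 images leaves 500.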