Context-Aware Meta-Learning
Authors: Christopher Fifty, Dennis Duan, Ronald Guenther Junkins, Ehsan Amid, Jure Leskovec, Christopher Ré, Sebastian Thrun
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | On 8 out of 11 few-shot image classification benchmarks, our approach without meta-training or fine-tuning exceeds or matches the state-of-the-art algorithm, P>M>F, which is meta-trained on these benchmarks. |
| Researcher Affiliation | Collaboration | ¹Stanford University, ²Google, ³Google DeepMind; fifty@cs.stanford.edu |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code is available at https://github.com/cfifty/CAML. |
| Open Datasets | Yes | we pre-train CAML's non-causal sequence model on few-shot image classification tasks from ImageNet-1k (Deng et al., 2009), Fungi (Schroeder & Cui, 2018), MSCOCO (Lin et al., 2014), and WikiArt (Saleh & Elgammal, 2015). |
| Dataset Splits | No | The paper mentions using early stopping during pre-training, which implies a validation process, and refers to 'train/validation splits of meta-learning benchmarks' for other methods. However, it does not explicitly provide the specific percentages or sample counts for training/validation/test splits for any dataset used in its own experiments. |
| Hardware Specification | No | The paper mentions model architectures like ViT-base and ViT-Large, but does not specify any hardware details such as GPU models, CPU types, or cloud computing resources used for experiments. |
| Software Dependencies | No | The paper mentions using Hugging Face models and general optimization settings, but does not provide specific version numbers for any software dependencies like programming languages, libraries, or frameworks (e.g., Python, PyTorch, CUDA versions). |
| Experiment Setup | Yes | When pre-training all models in the universal setting, we set the learning rate to a fixed 1 × 10⁻⁵ and do not perform any hyperparameter tuning in order to match the practices used by P>M>F. We use early stopping with a window size of 10 epochs during pre-training and the code release of Hu et al. (2022) to benchmark P>M>F with the training settings and hyperparameters described in their work. We select a batch size of 525 so the 5-way-1-shot episodes contain 520 query predictions and the 5-way-5-shot episodes contain 500 query predictions. |
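
The batch-size figure in the Experiment Setup row follows from a simple decomposition: each 525-example episode is split into n-way × k-shot support images, with the remaining slots used as query predictions. The sketch below is illustrative only and is not taken from the CAML code release; the function name and structure are assumptions made for this example.

```python
# Illustrative sketch (not from the CAML repository): reproduces the
# support/query arithmetic implied by the reported batch size of 525.

def episode_breakdown(batch_size: int, n_way: int, k_shot: int) -> dict:
    """Split a fixed-size episode batch into support and query counts."""
    n_support = n_way * k_shot        # one support image per class per shot
    n_query = batch_size - n_support  # remaining examples are query predictions
    return {"support": n_support, "query": n_query}

BATCH_SIZE = 525  # batch size reported in the paper

print(episode_breakdown(BATCH_SIZE, n_way=5, k_shot=1))  # {'support': 5, 'query': 520}
print(episode_breakdown(BATCH_SIZE, n_way=5, k_shot=5))  # {'support': 25, 'query': 500}
```

With this decomposition, the quoted query counts follow directly: 525 − 5 = 520 queries for 5-way-1-shot episodes and 525 − 25 = 500 queries for 5-way-5-shot episodes.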