Structured Prediction for Conditional Meta-Learning

Authors: Ruohan Wang, Yiannis Demiris, Carlo Ciliberto

NeurIPS 2020

Reproducibility variables, results, and the supporting evidence quoted from the paper:

Research Type: Experimental
    "Empirically, we show that TASML improves the performance of existing meta-learning models, and outperforms the state-of-the-art on benchmark datasets." "We empirically evaluate TASML on several competitive few-shot classification benchmarks, including datasets derived from ImageNet and CIFAR."

Researcher Affiliation: Academia
    "Ruohan Wang, Yiannis Demiris, Carlo Ciliberto. Dept. of Electrical and Electronic Engineering, Imperial College London, London, UK. {r.wang16, y.demiris, c.ciliberto}@imperial.ac.uk"

Pseudocode: Yes
    The paper presents the method as pseudocode in "Algorithm 1: TASML".

Open Source Code: Yes
    The TASML implementation is available at https://github.com/RuohanW/Tasml

Open Datasets: Yes
    "We empirically evaluate TASML on several competitive few-shot classification benchmarks, including datasets derived from ImageNet and CIFAR." "We perform experiments on C-way-K-shot learning within the episodic formulation of [53]." "We evaluate the proposed method against a wide range of meta-learning algorithms on three few-shot learning benchmarks: the miniImageNet, tieredImageNet and CIFAR-FS datasets."

Dataset Splits: Yes
    "For training, validation and testing, we sample three separate meta-datasets S_tr, S_val and S_ts, each accessing a disjoint set of classes (e.g. no class in S_ts appears in S_tr or S_val)." "D_val contains samples from the same C classes for estimating model generalization and training the meta-learner." (See the episodic sampling sketch below.)

Hardware Specification: Yes
    "Tab. 4, which reports the average number of meta-gradient steps per second on a single Nvidia GTX 2080."

Software Dependencies: No
    The paper does not provide version numbers for any software dependencies (e.g. libraries, or frameworks such as PyTorch).

Experiment Setup: Yes
    "We consider the commonly used 5-way-1-shot and 5-way-5-shot settings." "In our experiments we chose M to be 1% of N." "Appendix B reports further experimental details including network specification and hyperparameter choice." (See the top-M task selection sketch at the end of this list.)
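
To make the episodic protocol behind the "Dataset Splits" evidence concrete, here is a minimal sketch of C-way-K-shot episode sampling over disjoint class splits, under the description quoted above. The function names (split_classes, sample_episode), the 64/16/20 class fractions (the standard miniImageNet protocol, not stated in the quotes), and the 15-queries-per-class default are illustrative assumptions, not details confirmed by the paper.

```python
import random

def split_classes(all_classes, fractions=(0.64, 0.16, 0.20), seed=0):
    """Partition classes into disjoint meta-train/val/test pools,
    so no test class appears during meta-training or validation."""
    rng = random.Random(seed)
    classes = list(all_classes)
    rng.shuffle(classes)
    n = len(classes)
    n_tr, n_val = int(fractions[0] * n), int(fractions[1] * n)
    return classes[:n_tr], classes[n_tr:n_tr + n_val], classes[n_tr + n_val:]

def sample_episode(images_by_class, class_pool, C=5, K=1, n_query=15, seed=None):
    """Sample one C-way-K-shot episode: a support set D (C*K examples)
    and a query set D_val drawn from the same C classes."""
    rng = random.Random(seed)
    episode_classes = rng.sample(class_pool, C)
    support, query = [], []
    for label, cls in enumerate(episode_classes):
        imgs = rng.sample(images_by_class[cls], K + n_query)
        support += [(x, label) for x in imgs[:K]]
        query += [(x, label) for x in imgs[K:]]
    return support, query
```

Sampling episodes from each class pool in turn yields the three meta-datasets S_tr, S_val and S_ts described in the quote.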
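The "M to be 1% of N" quote refers to TASML restricting its task-weighted objective to the top-M training tasks deemed most relevant to the target task. The sketch below illustrates only that filtering step: the mean-embedding task summary and the Gaussian similarity are stand-in assumptions (the paper defines task weights via a kernel between datasets), and all names here are hypothetical.

```python
import numpy as np

def task_embedding(support_set, encoder):
    """Hypothetical task summary: mean of encoded support examples,
    a prototype-style stand-in for the paper's dataset kernel."""
    feats = np.stack([encoder(x) for x, _ in support_set])
    return feats.mean(axis=0)

def top_m_tasks(target_task, train_tasks, encoder, m_fraction=0.01):
    """Keep the top-M training tasks most similar to the target task,
    with M = 1% of N as in the paper's experiments."""
    z = task_embedding(target_task, encoder)
    zs = np.stack([task_embedding(t, encoder) for t in train_tasks])
    dists = np.linalg.norm(zs - z, axis=1)
    weights = np.exp(-dists ** 2)        # illustrative Gaussian similarity
    m = max(1, int(m_fraction * len(train_tasks)))
    idx = np.argsort(-weights)[:m]       # indices of the top-M tasks
    return idx, weights[idx] / weights[idx].sum()
```

Under these assumptions, the returned indices and normalized weights would define the reduced, task-weighted objective that makes the method tractable at scale.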