Improving Task-Specific Generalization in Few-Shot Learning via Adaptive Vicinal Risk Minimization

Authors: Long-Kai Huang, Ying Wei

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "To verify the performance of the proposed method, we conduct experiments on three standard few-shot learning benchmarks and consolidate the superiority of the proposed method over state-of-the-art few-shot learning baselines."
Researcher Affiliation | Collaboration | Long-Kai Huang (Tencent AI Lab, hlongkai@gmail.com); Ying Wei (City University of Hong Kong, yingwei@cityu.edu.hk)
Pseudocode | Yes | Algorithm 1: Adaptive Vicinal Few-Shot Learning (ADV)
Open Source Code | Yes | "Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Yes]"
Open Datasets | Yes | "We use three benchmarks for performance evaluation: miniImageNet [39], CUB [40] and CIFAR-FS [2]."
Dataset Splits | Yes | "The learning rate is determined by performing a grid search from 0.001 to 1 on the tasks constructed by the meta-validation set."
Hardware Specification | No | The paper mentions a 'differentiable GPU-based QP solver' but does not specify the exact GPU/CPU models, memory amounts, or other machine specifications used to run the experiments.
Software Dependencies | No | The paper mentions a 'differentiable GPU-based QP solver [1]' but does not provide specific version numbers for any software components, libraries, or solvers.
Experiment Setup | Yes | "For ADV-CE, we initialize the weights of the classifier by class prototypes and optimize the vicinal loss in (5) by gradient descent for 100 steps. The learning rate is determined by performing a grid search from 0.001 to 1 on the tasks constructed by the meta-validation set. For ADV-SVM, we solve the QP in (9) by using a differentiable GPU-based QP solver [1]. The regularization parameter λ is set to 0.1 and the parameter σ for RBF kernel is obtained via grid search from 0.1 to 10. In the lazy random walk algorithm, the number of steps T, the lazy stay probability β and the temperature are obtained via grid search in {1, 2, 3, 4, 5}, {0.1, 0.2, 0.5}, {0.01, 0.1, 1, 10}, respectively." (See the illustrative sketches below.)
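The quoted ADV-CE recipe (classifier weights initialized from class prototypes, then 100 steps of gradient descent) can be illustrated with a minimal PyTorch sketch. The function names and tensor shapes here are assumptions, and plain cross-entropy stands in for the paper's vicinal loss (Eq. 5), which is not reproduced in this summary.

```python
import torch
import torch.nn.functional as F

def init_prototype_classifier(support_feats, support_labels, n_way):
    # Weight of class c = mean embedding of its support examples (its prototype).
    protos = torch.stack([support_feats[support_labels == c].mean(dim=0)
                          for c in range(n_way)])
    weight = protos.clone().requires_grad_(True)   # [n_way, feat_dim]
    bias = torch.zeros(n_way, requires_grad=True)
    return weight, bias

def adapt_adv_ce(support_feats, support_labels, n_way, lr, steps=100):
    # NOTE: plain cross-entropy is a stand-in for the paper's vicinal loss (Eq. 5).
    weight, bias = init_prototype_classifier(support_feats, support_labels, n_way)
    opt = torch.optim.SGD([weight, bias], lr=lr)
    for _ in range(steps):                         # 100 GD steps, per the paper
        logits = support_feats @ weight.t() + bias
        loss = F.cross_entropy(logits, support_labels)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return weight, bias
```

For a 5-way 1-shot task, `support_feats` would be a [5, D] feature tensor and `support_labels` would be `tensor([0, 1, 2, 3, 4])`; the learning rate `lr` is the value grid-searched from 0.001 to 1 on meta-validation tasks.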
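For ADV-SVM, the setup grid-searches the RBF kernel parameter σ from 0.1 to 10. Below is a sketch of the standard RBF kernel; the exact parameterization (1/(2σ²)) is an assumption, and the differentiable QP solve of Eq. (9) is omitted since the paper's formulation is not reproduced here.

```python
import torch

def rbf_kernel(X, Y, sigma):
    # Standard RBF kernel matrix: K[i, j] = exp(-||x_i - y_j||^2 / (2 * sigma^2)).
    # sigma is the bandwidth grid-searched from 0.1 to 10 in the paper.
    sq_dists = torch.cdist(X, Y).pow(2)  # pairwise squared Euclidean distances
    return torch.exp(-sq_dists / (2.0 * sigma ** 2))
```

In a pipeline like the paper's, such a kernel matrix over support features would feed the dual SVM QP before it is handed to a differentiable GPU-based solver.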
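The lazy-random-walk hyperparameters are selected by grid search over the sets listed in the quote. A minimal sketch, assuming a hypothetical `evaluate_on_val_tasks` helper that scores a configuration on meta-validation tasks:

```python
import itertools

def evaluate_on_val_tasks(T, beta, temperature):
    # Stand-in: in the full pipeline this would run ADV with these settings on
    # tasks sampled from the meta-validation split and return mean accuracy.
    return 0.0

best_cfg, best_acc = None, float("-inf")
for T, beta, temp in itertools.product([1, 2, 3, 4, 5],      # number of steps T
                                       [0.1, 0.2, 0.5],      # lazy stay prob. beta
                                       [0.01, 0.1, 1, 10]):  # temperature
    acc = evaluate_on_val_tasks(T, beta, temp)
    if acc > best_acc:
        best_cfg, best_acc = (T, beta, temp), acc
```

The learning rate for ADV-CE (0.001 to 1) and σ for ADV-SVM (0.1 to 10) are tuned the same way, though the excerpt does not list the exact grid points for those two ranges.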