POODLE: Improving Few-shot Learning via Penalizing Out-of-Distribution Samples

Authors: Duong Le, Khoi Duc Nguyen, Khoi Nguyen, Quoc-Huy Tran, Rang Nguyen, Binh-Son Hua

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on various standard benchmarks demonstrate that the proposed method consistently improves the performance of pretrained networks with different architectures.
Researcher Affiliation | Collaboration | Duong H. Le (VinAI Research, v.duonglh5@vinai.io), Khoi D. Nguyen (VinAI Research, khoinguyenucd@gmail.com), Khoi Nguyen (VinAI Research, ducminhkhoi@gmail.com), Quoc-Huy Tran (Retrocausal, Inc., huy@retrocausal.ai), Rang Nguyen (VinAI Research, rangnhm@gmail.com), Binh-Son Hua (VinAI Research & VinUniversity)
Pseudocode | Yes | Please see the supplemental document for the pseudo-code (Section C).
Open Source Code | Yes | Our code is available at https://github.com/VinAIResearch/poodle.
Open Datasets | Yes | The mini-ImageNet dataset [56] consists of 100 classes chosen from the ImageNet dataset [48]... The tiered-ImageNet [46] is another FSL dataset... Caltech-UCSD Birds (CUB) has 200 classes... Furthermore, we also carry out experiments on iNaturalist 2017 (iNat) [54], EuroSAT [22], and ISIC-2018 (ISIC) [7]...
Dataset Splits | Yes | The mini-ImageNet dataset [56] consists of 100 classes chosen from the ImageNet dataset [48], including 64 training, 16 validation, and 20 test classes... The tiered-ImageNet [46] is another FSL dataset... with 351 base, 97 validation, and 160 test classes... Caltech-UCSD Birds (CUB) has 200 classes, split into 100, 50, and 50 classes for train, validation, and test, following [6].
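For quick reference, the class splits quoted above can be collected into a small configuration mapping. This is only an illustrative sketch built from the numbers in the row above; the dictionary name and layout are not from the paper.

```python
# Illustrative summary of the class splits quoted above (counts of classes, not images).
DATASET_CLASS_SPLITS = {
    "mini-ImageNet":   {"train": 64,  "val": 16, "test": 20},
    "tiered-ImageNet": {"base": 351,  "val": 97, "test": 160},
    "CUB":             {"train": 100, "val": 50, "test": 50},
}
```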
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory amounts) used for running its experiments.
Software Dependencies | No | The paper mentions software components and frameworks (e.g., ResNet12, Adam optimizer), but it does not specify version numbers for these or any other software dependencies.
Experiment Setup | Yes | For pre-training on the base classes, we train our backbones with the standard cross-entropy loss for 100 epochs. The optimizer has a weight decay of 5e-4, and the initial learning rate of 0.05 is decreased by a factor of 10 after 60 and 80 epochs on mini-ImageNet and after 60, 80, and 90 epochs on tiered-ImageNet. We use a batch size of 64 for all the networks. For fine-tuning on the novel classes, we utilize the Adam optimizer [32] with a fixed learning rate of 0.001, β1 = 0.9, β2 = 0.999, and no weight decay. The classifier is trained for 250 iterations. The coefficients of the push/pull losses are α = 1 and β = 0.5, respectively.
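To make the fine-tuning recipe above concrete, below is a minimal PyTorch sketch of training a linear classifier on frozen backbone features with a pull term on the labeled support set and a push term on out-of-distribution samples. The exact form of the push loss, the assignment of the quoted coefficients to the two terms, and all names (`finetune_classifier`, `feats_support`, `feats_ood`, etc.) are assumptions for illustration, not the paper's implementation; the pre-training stage is the standard cross-entropy training described above and is omitted here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def finetune_classifier(feats_support, labels_support, feats_ood, n_way,
                        coef_push=1.0, coef_pull=0.5, iters=250):
    """Sketch of the fine-tuning stage on novel classes (assumed details noted below)."""
    classifier = nn.Linear(feats_support.shape[1], n_way)

    # Adam with a fixed learning rate, default betas, and no weight decay, as quoted above.
    optim = torch.optim.Adam(classifier.parameters(), lr=1e-3,
                             betas=(0.9, 0.999), weight_decay=0.0)

    for _ in range(iters):  # 250 iterations, per the setup above
        optim.zero_grad()

        # Pull term: standard cross-entropy on the labeled support samples.
        loss_pull = F.cross_entropy(classifier(feats_support), labels_support)

        # Push term (assumed form): discourage confident predictions on
        # out-of-distribution samples by penalizing their top class probability.
        probs_ood = F.softmax(classifier(feats_ood), dim=1)
        loss_push = probs_ood.max(dim=1).values.mean()

        # The mapping of the quoted coefficients (1 for push, 0.5 for pull) onto the two
        # terms follows the order in the setup text and is itself an assumption.
        loss = coef_push * loss_push + coef_pull * loss_pull
        loss.backward()
        optim.step()

    return classifier
```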