Towards Practical Few-shot Query Sets: Transductive Minimum Description Length Inference
Authors: Ségolène Martin, Malik Boudiaf, Emilie Chouzenoux, Jean-Christophe Pesquet, Ismail Ben Ayed
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Comprehensive experiments over the standard few-shot datasets and the more realistic and challenging i-Nat dataset show highly competitive performances of our method, more so when the numbers of possible classes in the tasks increase. |
| Researcher Affiliation | Academia | Ségolène Martin (Université Paris-Saclay, Inria, CentraleSupélec, CVN); Malik Boudiaf (ÉTS Montreal); Emilie Chouzenoux (Université Paris-Saclay, Inria, CentraleSupélec, CVN); Jean-Christophe Pesquet (Université Paris-Saclay, Inria, CentraleSupélec, CVN); Ismail Ben Ayed (ÉTS Montreal) |
| Pseudocode | Yes | Algorithm 1: PrimAl Dual minimum Description LEngth (PADDLE) |
| Open Source Code | Yes | Our code is publicly available at https://github.com/SegoleneMartin/PADDLE. |
| Open Datasets | Yes | We deployed three datasets for few-shot classification: mini-Imagenet [38], tiered-Imagenet [18], and i-Nat [23]. |
| Dataset Splits | Yes | We followed the standard split of 64 classes for base training, 16 for validation, and 20 for testing [39, 25]. |
| Hardware Specification | No | The paper states that 'Both methods are run on the same machine' but does not provide any specific details about the hardware used (e.g., GPU/CPU models, memory). |
| Software Dependencies | No | The paper describes some aspects of the training process, such as 'standard cross-entropy minimization with label smoothing', but it does not specify software dependencies with version numbers (e.g., PyTorch 1.x, CUDA 11.x). |
| Experiment Setup | Yes | The label smoothing parameter is set to 0.1, for 90 epochs, using a learning rate initialized to 0.1 and divided by 10 at epochs 45 and 66. We use batch sizes of 256 for ResNet-18 and of 128 for WRN28-10. The images are resized to 84 × 84 pixels, both at training and evaluation time. Color jittering, random cropping, and random horizontal flipping augmentations are applied during training. |
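The reported training schedule (initial learning rate 0.1, divided by 10 at epochs 45 and 66, over 90 epochs) can be sketched as a simple step schedule. This is an illustrative reconstruction using the hyperparameter values quoted above; the function name and structure are our own, not taken from the authors' code.

```python
def learning_rate(epoch, base_lr=0.1, milestones=(45, 66), factor=0.1):
    """Step schedule: multiply base_lr by `factor` at each milestone epoch.

    Values (base_lr=0.1, milestones at epochs 45 and 66, division by 10)
    follow the experiment-setup description in the paper; everything else
    is an illustrative sketch.
    """
    lr = base_lr
    for m in milestones:
        if epoch >= m:
            lr *= factor
    return lr

# Illustrative check of the schedule across the 90 training epochs:
for epoch in (0, 44, 45, 65, 66, 89):
    print(epoch, learning_rate(epoch))
```

In a PyTorch-based pipeline this would typically be expressed with `torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[45, 66], gamma=0.1)`, which implements the same step decay.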