Re-ranking for image retrieval and transductive few-shot classification

Authors: Xi Shen, Yang Xiao, Shell Xu Hu, Othman Sbai, Mathieu Aubry

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "3 Experiments: In this section, we cover our experimental setups and results for image retrieval and few-shot image classification. Since these two problems are different in data processing and performance evaluation, we separate the discussions into two sub-sections followed by a joint ablation study."
Researcher Affiliation | Collaboration | Xi Shen, Yang Xiao, Shell Xu Hu, Othman Sbai, and Mathieu Aubry; affiliations: LIGM (UMR 8049), École des Ponts ParisTech (Shen, Xiao, Sbai, Aubry) and Samsung AI Center, Cambridge (Hu)
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | Yes | "Our code is available at https://imagine.enpc.fr/~shenx/SSR/."
Open Datasets | Yes | "We consider five image retrieval datasets, namely, CUB [67], CARS [31], SOP [62], rOxford5K [49] and rParis6K [49]." For few-shot classification: mini-ImageNet [66], tiered-ImageNet [51], and CIFAR-FS [4].
Dataset Splits | Yes | For CUB, the first 100 species (5,864 images) are used for training and the remaining 100 species (5,924 images) for testing; for CARS, the first 98 classes (8,054 images) are used for training and the other 98 classes (8,131 images) for testing; for SOP, the dataset is separated into 11,318 training classes (59,551 images) and 11,316 testing classes (60,502 images). mini-ImageNet contains 100 classes with 600 images per class, split into 64 classes for training, 16 for validation and 20 for testing. tiered-ImageNet is a larger subset of ImageNet with 608 classes and 1,300 images per class, split into 351 classes for training, 97 for validation and 160 for testing. CIFAR-FS was created by dividing the original CIFAR-100 [32] into 64 training classes, 16 validation classes and 20 testing classes.
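The split statistics above can be collected into a small reference structure. This is only an illustrative sketch: the variable names and dataset keys are ours, not from the released code, and mini-ImageNet is recorded with its standard 64/16/20 class split.

```python
# Image retrieval splits: (number of classes, number of images) per split.
RETRIEVAL_SPLITS = {
    "CUB":  {"train": (100, 5_864),     "test": (100, 5_924)},
    "CARS": {"train": (98, 8_054),      "test": (98, 8_131)},
    "SOP":  {"train": (11_318, 59_551), "test": (11_316, 60_502)},
}

# Few-shot classification splits: class counts only.
FEWSHOT_SPLITS = {
    "mini-ImageNet":   {"train": 64,  "val": 16, "test": 20},
    "tiered-ImageNet": {"train": 351, "val": 97, "test": 160},
    "CIFAR-FS":        {"train": 64,  "val": 16, "test": 20},
}
```

The class counts per few-shot benchmark sum to the reported dataset sizes (100, 608, and 100 classes respectively), which is a quick sanity check on the quoted splits.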
Hardware Specification | Yes | "The entire training on CUB [67] takes 6 hours on a single GeForce 1080 Ti GPU." "The whole training process on mini-ImageNet [66] takes 20 hours on a single GeForce 1080 Ti GPU."
Software Dependencies | No | The paper mentions software components such as ReLU activations and Instance Normalization, but does not specify version numbers for any libraries or frameworks used (e.g., PyTorch, TensorFlow).
Experiment Setup | Yes | "Each subgraph update in our SSR module is performed by a three-layer perceptron with constant hidden-layer size: 1,024 for image retrieval and 4,096 for few-shot classification. We optimize our networks using SGD with momentum 0.9. The batch size is set to 1. For image retrieval, we use a single update of the model (T = 1) and training converges in 10K iterations with a fixed learning rate of 1e-5. For few-shot classification, we first train for 30K iterations with T = 1: the learning rate is set to 0.1 for 5K iterations, then to 0.01 for another 25K iterations. Then, keeping a learning rate of 0.01, we train for 10K iterations with T = 2 and 10K more with T = 3."
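The quoted few-shot training schedule can be sketched as a simple mapping from training iteration to hyperparameters. This is an illustrative helper, not from the released code; the function name is ours and iterations are assumed 0-indexed.

```python
def ssr_fewshot_schedule(iteration):
    """Return (learning_rate, T) for a given training iteration.

    Schedule as described in the paper: 30K iterations with T = 1
    (lr 0.1 for the first 5K, then 0.01 for the remaining 25K),
    followed by 10K iterations with T = 2 and 10K more with T = 3,
    both at lr 0.01, for 50K iterations total.
    """
    if iteration < 5_000:
        return 0.1, 1
    if iteration < 30_000:
        return 0.01, 1
    if iteration < 40_000:
        return 0.01, 2
    return 0.01, 3
```

In practice such a step schedule would drive an SGD optimizer (momentum 0.9, batch size 1, per the setup above) by updating the learning rate and number of SSR updates T at each iteration.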