Smoothed Embeddings for Certified Few-Shot Learning
Authors: Mikhail Pautov, Olesya Kuznetsova, Nurislam Tursynbek, Aleksandr Petiushko, Ivan Oseledets
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our theoretical results are confirmed by experiments on different datasets. Our contributions are summarized as follows: We provide the first theoretical robustness guarantee for few-shot learning classification. Analysis of Lipschitz continuity of such models and providing the robustness certificates against ℓ₂-bounded perturbations for few-shot learning scenarios. We propose to estimate confidence intervals not for distances between the approximation of smoothed embedding and class prototype but for the dot product of vectors which has expectation equal to the distance between actual smoothed embedding and class prototype. 5 Experiments |
| Researcher Affiliation | Collaboration | Mikhail Pautov (Skolkovo Institute of Science and Technology); Olesya Kuznetsova (Skolkovo Institute of Science and Technology); Nurislam Tursynbek (The University of North Carolina at Chapel Hill); Aleksandr Petiushko (Moscow State University; Nuro, Inc.); Ivan Oseledets (Skolkovo Institute of Science and Technology; AIRI) |
| Pseudocode | Yes | Algorithm 1: Closest prototype computation algorithm. Algorithm 2: Adversarial embedding risk computation algorithm. |
| Open Source Code | Yes | The code for this paper is available at https://github.com/koava36/certrob-fewshot. |
| Open Datasets | Yes | For the experimental evaluation of our approach we use several well-known datasets for few-shot learning classification. Cub-200-2011 [45] is a dataset with 11,788 images of 200 bird species, where 5864 images of 100 species are in the train subset and 5924 images of other 100 species are in the test subset. miniImageNet [44] is a subset of images from ILSVRC 2015 [37] dataset with 64 image categories in train subset, 16 categories in validation subset and 20 categories in test subset with 600 images of size 84 × 84 in each category. CIFAR-FS [1] is a subset of CIFAR-100 [19] dataset which was generated in the same way as miniImageNet and contains 37800 images of 64 categories in the train set and 11400 images of 20 categories in the test set. |
| Dataset Splits | Yes | Cub-200-2011 [...] where 5864 images of 100 species are in the train subset and 5924 images of other 100 species are in the test subset. miniImageNet [...] with 64 image categories in train subset, 16 categories in validation subset and 20 categories in test subset with 600 images of size 84 × 84 in each category. CIFAR-FS [...] contains 37800 images of 64 categories in the train set and 11400 images of 20 categories in the test set. |
| Hardware Specification | Yes | In the table below, we report the computation time of the certification procedure per image on Tesla V100 GPU for Cub-200-2011 dataset. |
| Software Dependencies | No | The paper mentions using a 'prototypical network introduced in [39] with ConvNet-4 backbone' and applying 'Gaussian noise' but does not specify software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | Parameters of experiments. For data augmentation, we applied Gaussian noise with zero mean, unit variance and probability 0.3 of augmentation. Each dataset was certified on a subsample of 500 images with default parameters for Algorithm 1: number of samples n = 1000, confidence level α = 0.001 and variance σ = 1.0, unless stated otherwise. For our settings, it may be shown from simple geometry that values (a_i, b_i) from (9) are such that b_i − a_i ≤ 4, so we use b_i − a_i = 4. The maximum number of samples T in Algorithm 1 is set to be T = 5 × 10^5. |
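
The experiment setup quoted above can be illustrated with a minimal Python sketch. It is not the authors' implementation (that lives in the linked repository); the function names, the flat-list image representation, and the `embed` callable are all hypothetical. The sketch shows noise augmentation applied with probability 0.3, a Monte Carlo estimate of a smoothed embedding from n noisy samples, and a generic two-sided Hoeffding half-width for samples whose range is at most 4. The Hoeffding form is an assumption here and may differ from the paper's bound (9).

```python
import math
import random


def maybe_add_gaussian_noise(image, p=0.3, sigma=1.0, rng=random):
    """With probability p, add N(0, sigma^2) noise to every pixel.

    `image` is a flat list of floats (a hypothetical stand-in for a tensor).
    Otherwise an unchanged copy is returned.
    """
    if rng.random() >= p:
        return list(image)
    return [pixel + rng.gauss(0.0, sigma) for pixel in image]


def smoothed_embedding(embed, x, n=1000, sigma=1.0, rng=random):
    """Monte Carlo estimate of E[embed(x + eps)] with eps ~ N(0, sigma^2 I).

    `embed` is any callable mapping a list of floats to a list of floats.
    """
    total = None
    for _ in range(n):
        noisy = [v + rng.gauss(0.0, sigma) for v in x]
        z = embed(noisy)
        total = list(z) if total is None else [t + zi for t, zi in zip(total, z)]
    return [t / n for t in total]


def hoeffding_half_width(n, alpha=0.001, width=4.0):
    """Two-sided Hoeffding half-width for the mean of n i.i.d. samples.

    `width` plays the role of b_i - a_i; this generic form is an assumption
    and is not taken from the paper.
    """
    return width * math.sqrt(math.log(2.0 / alpha) / (2.0 * n))


if __name__ == "__main__":
    rng = random.Random(0)
    x = [0.5, -1.0, 2.0]
    # Identity embedding used purely for illustration.
    print(smoothed_embedding(lambda v: v, x, n=1000, sigma=1.0, rng=rng))
    print(hoeffding_half_width(n=1000, alpha=0.001, width=4.0))
```

Under these assumptions the half-width shrinks as 1/sqrt(n), which suggests why a cap on the number of samples such as T = 5 × 10^5 may be needed when the margin between the two closest prototypes is small.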