Meta Evidential Transformer for Few-Shot Open-Set Recognition
Authors: Hitesh Sapkota, Krishna Prasad Neupane, Qi Yu
ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on real-world datasets demonstrate consistent improvement over existing competitive methods in unseen class recognition without deteriorating closed-set performance. |
| Researcher Affiliation | Collaboration | Hitesh Sapkota (1,2), Krishna Prasad Neupane (1,2), Qi Yu (2); (1) Amazon Inc. (work was done at RIT and is not related to the position at Amazon), (2) Rochester Institute of Technology (RIT). |
| Pseudocode | Yes | Algorithm 1 shows the overall training process of our proposed MET technique. As shown, we use both the open-set and the closed-set loss to optimize the network parameters. Algorithm 2 shows the corresponding inference algorithm that leverages the cross-attention mechanism. (A hedged sketch of this combined-loss structure is given after the table.) |
| Open Source Code | Yes | F. Source Code: For the source code, please click here. |
| Open Datasets | Yes | We conduct experimentation on multiple datasets, including MiniImageNet (Vinyals et al., 2016), TieredImageNet (Ren et al., 2018), CIFAR-100 (Krizhevsky et al., 2009), and Caltech-101 (Fei-Fei et al., 2004). |
| Dataset Splits | Yes | Table 5 shows the dataset splits for four datasets: MiniImageNet, TieredImageNet, CIFAR-100, and Caltech-101. ... Table 5. Train/Evaluation/Test partition on different datasets. Split MiniImageNet ... Train Eval Test ... Closed-set 46 10 9 |
| Hardware Specification | No | The paper mentions using 'ResNet-12 as a backbone architecture' but does not provide specific hardware details such as GPU models, CPU types, or memory specifications used for the experiments. |
| Software Dependencies | No | The paper mentions using 'stochastic gradient descent (SGD)' and ResNet-12 but does not provide specific version numbers for software dependencies such as Python, PyTorch, or CUDA. |
| Experiment Setup | Yes | For training, stochastic gradient descent (SGD) is used for a total of 200 epochs. The initial learning rate is set to 0.002 and is decreased by 10% every 20 epochs. The weight decay is set to 0.005 and λ is set to 1 throughout the experimentation. (An illustrative optimizer and scheduler sketch follows the table.) |
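
The training objective quoted above (Algorithm 1) combines a closed-set loss and an open-set loss, weighted by λ. The snippet below is a minimal sketch of that structure only, assuming a PyTorch setup; `combined_loss`, the cross-entropy closed-set term, and the entropy-style open-set placeholder are illustrative stand-ins, not the paper's evidential loss terms.

```python
import torch
import torch.nn.functional as F

def combined_loss(closed_logits, closed_labels, open_logits, lam=1.0):
    """Hypothetical combined objective mirroring the structure of Algorithm 1:
    closed-set loss + lambda * open-set loss (lambda = 1 in the paper)."""
    # Closed-set term: plain cross-entropy over the known classes
    # (a stand-in for the paper's evidential closed-set loss).
    l_closed = F.cross_entropy(closed_logits, closed_labels)
    # Open-set term (placeholder): minimize negative entropy on open-set
    # queries, i.e. push their predictions toward uniform; the paper uses
    # an evidential formulation instead.
    probs = F.softmax(open_logits, dim=-1)
    l_open = (probs * probs.clamp_min(1e-12).log()).sum(dim=-1).mean()
    return l_closed + lam * l_open

# Toy usage: a 5-way episode with 10 closed-set and 10 open-set queries.
closed_logits = torch.randn(10, 5, requires_grad=True)
closed_labels = torch.randint(0, 5, (10,))
open_logits = torch.randn(10, 5, requires_grad=True)
loss = combined_loss(closed_logits, closed_labels, open_logits, lam=1.0)
loss.backward()
```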
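
The reported training hyperparameters map directly onto a standard PyTorch optimizer and scheduler. This is a minimal sketch under stated assumptions: the momentum value is not given in the excerpt and is assumed here, "decreased by 10%" is read literally as a 0.9 multiplicative factor every 20 epochs (a step to one tenth would instead be `gamma=0.1`), and `backbone` is a tiny stand-in for the paper's ResNet-12.

```python
import torch
from torch.optim import SGD
from torch.optim.lr_scheduler import StepLR

# Tiny stand-in for the ResNet-12 backbone used in the paper.
backbone = torch.nn.Linear(640, 64)

# Reported settings: SGD, 200 epochs, initial lr 0.002, weight decay 0.005.
# Momentum is not reported in the excerpt; 0.9 is an assumed common default.
optimizer = SGD(backbone.parameters(), lr=0.002, momentum=0.9, weight_decay=0.005)

# "Decreased by 10% every 20 epochs", read literally as multiplying by 0.9.
scheduler = StepLR(optimizer, step_size=20, gamma=0.9)

for epoch in range(200):
    # ... per-episode forward pass, combined_loss(...), loss.backward(),
    # optimizer.step(), optimizer.zero_grad() ...
    scheduler.step()  # decay the learning rate on the reported schedule
```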