Boosting Few-Shot Open-Set Recognition with Multi-Relation Margin Loss
Authors: Yongjuan Che, Yuexuan An, Hui Xue
IJCAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on public benchmarks reveal that methods with MRM loss can improve the AUROC of unknown detection by a significant margin while correctly classifying the known classes. |
| Researcher Affiliation | Academia | 1) School of Computer Science and Engineering, Southeast University, Nanjing, 210096, China; 2) MOE Key Laboratory of Computer Network and Information Integration (Southeast University), China. {yjche, yxan, hxue}@seu.edu.cn |
| Pseudocode | No | The paper describes the method using mathematical formulations and text, but it does not include a clearly labeled pseudocode or algorithm block. |
| Open Source Code | Yes | Code is available at https://github.com/Casie-che/MRM. |
| Open Datasets | Yes | We conduct experiments on three public benchmark datasets CUB-200 [Wah et al., 2011], tieredImageNet [Ren et al., 2018], and miniImageNet [Vinyals et al., 2016] to verify the effectiveness of MRM. |
| Dataset Splits | Yes | During training, we use the validation set to select the best model. ... Following [Liu et al., 2020b], we set N = 5 and K = 1, 5 during meta-training and meta-testing. (See the episode-sampling sketch after the table.) |
| Hardware Specification | No | The paper mentions using ResNet12 architecture but does not specify any hardware details like GPU models, CPU types, or memory used for the experiments. |
| Software Dependencies | No | The paper mentions using ResNet12 and an SGD optimizer but does not provide specific version numbers for any software libraries, frameworks, or programming languages used. |
| Experiment Setup | Yes | The initial learning rate is set to 0.0002 for the feature extractor and 0.002 for transformers with a multistep learning rate schedule. MRM finetunes the feature extractor over 30 epochs with 0.0005 weight decay. ... We first train the network parameters θ while the radius R is fixed; then, after one epoch, we calculate the radius for each class based on the embeddings extracted from the latest update of the network. ... We set λ = 0.1, α = 1, β = 3, m = 1 for the loss function in Eq.(1), and ν = 0.1 for hypersphere updating in Eq.(7). (See the training-configuration sketch after the table.) |
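
The 5-way, 1- and 5-shot protocol quoted in the Dataset Splits row reduces to episodic sampling. Below is a minimal sketch of that sampling step, assuming a `label_to_indices` map from class labels to example indices and an illustrative query-set size of 15 per class; the quote itself fixes only N = 5 and K = 1, 5.

```python
import random

# Minimal sketch of N-way K-shot episode sampling, as quoted from the
# paper (N = 5, K = 1 or 5 during meta-training and meta-testing).
# label_to_indices and q_query are illustrative assumptions.
def sample_episode(label_to_indices, n_way=5, k_shot=1, q_query=15):
    """Return support/query index lists for one few-shot episode."""
    classes = random.sample(sorted(label_to_indices), n_way)
    support, query = [], []
    for c in classes:
        idx = random.sample(label_to_indices[c], k_shot + q_query)
        support.extend(idx[:k_shot])   # K labeled examples per class
        query.extend(idx[k_shot:])     # remaining examples to classify
    return support, query

# Usage with a toy index map of 20 classes, 600 examples each:
toy_map = {c: list(range(c * 600, (c + 1) * 600)) for c in range(20)}
support, query = sample_episode(toy_map, n_way=5, k_shot=5)
```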
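The Experiment Setup row can likewise be summarized as a hedged training-configuration sketch. Only the learning rates, the 0.0005 weight decay, the 30-epoch finetuning, and the hyperparameters λ, α, β, m, ν come from the quote; the SGD momentum, the multistep milestones, the stand-in modules, `train_loader`, `compute_class_radii`, the placeholder `mrm_loss`, and the EMA form of the Eq.(7) radius update are assumptions made for illustration, not the paper's implementation.

```python
import torch
from torch import nn
from torch.optim import SGD
from torch.optim.lr_scheduler import MultiStepLR

# Quoted hyperparameters: lambda = 0.1, alpha = 1, beta = 3, m = 1 for
# Eq.(1); nu = 0.1 for the hypersphere update in Eq.(7).
LAMBDA, ALPHA, BETA, MARGIN, NU = 0.1, 1.0, 3.0, 1.0, 0.1
NUM_CLASSES, EMB_DIM = 64, 64                # assumed sizes, illustration only

feature_extractor = nn.Linear(8, EMB_DIM)    # stand-in for the ResNet12 backbone
transformer = nn.Linear(EMB_DIM, EMB_DIM)    # stand-in for the transformer head

optimizer = SGD(
    [
        {"params": feature_extractor.parameters(), "lr": 2e-4},  # quoted
        {"params": transformer.parameters(), "lr": 2e-3},        # quoted
    ],
    momentum=0.9,       # assumed; not stated in the quote
    weight_decay=5e-4,  # quoted "0.0005 weight decay"
)
scheduler = MultiStepLR(optimizer, milestones=[15, 25], gamma=0.1)  # assumed milestones

radii = torch.ones(NUM_CLASSES)  # per-class hypersphere radii, fixed within an epoch

def mrm_loss(embeddings, labels, radii):
    # Hypothetical placeholder for the multi-relation margin loss of Eq.(1);
    # the true form (using LAMBDA, ALPHA, BETA, MARGIN) is in the paper.
    return embeddings.pow(2).mean()

for epoch in range(30):  # "finetuning the feature extractor over 30 epochs"
    # Train the network parameters theta for one epoch with R fixed.
    for x, y in train_loader:  # hypothetical episodic loader
        loss = mrm_loss(transformer(feature_extractor(x)), y, radii)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    scheduler.step()
    # Then recompute each class radius from the latest embeddings; the EMA
    # combination with nu is an assumption about the form of Eq.(7).
    with torch.no_grad():
        new_radii = compute_class_radii(feature_extractor, transformer)  # hypothetical
        radii = (1.0 - NU) * radii + NU * new_radii
```

The alternating structure mirrors the quoted procedure: θ is trained for one epoch while the radii R stay fixed, after which each class radius is recomputed from the newest embeddings and blended in with ν = 0.1.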