Weakly Supervised Few-Shot Object Detection with DETR
Authors: Chenbo Zhang, Yinglu Zhang, Lu Zhang, Jiajia Zhao, Jihong Guan, Shuigeng Zhou
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments also show that the proposed method clearly outperforms the existing counterparts in the WS-FSOD task. We conduct extensive experiments on benchmark datasets, which show that the proposed method significantly outperforms the state-of-the-art methods, validating its effectiveness. |
| Researcher Affiliation | Academia | (1) Shanghai Key Lab of Intelligent Information Processing, and School of Computer Science, Fudan University, China; (2) Science and Technology on Complex System Control and Intelligent Agent Cooperation Laboratory, Beijing Electro-Mechanical Engineering Institute, China; (3) Department of Computer Science & Technology, Tongji University, China |
| Pseudocode | No | The paper describes the proposed method in detail with figures, but it does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper mentions referring to 'our supplementary material for more details' but does not explicitly state that source code is provided or offer a direct link to a code repository. |
| Open Datasets | Yes | Following the only previous work StarNet (Karlinsky et al. 2021) in WS-FSOD, we take three benchmark datasets for evaluation: ImageNetLoc-FS, CUB-200 and PASCAL VOC. For ImageNetLoc-FS (Eli et al. 2019)... For CUB (Wah et al. 2011)... For PASCAL VOC (Everingham et al. 2010)... |
| Dataset Splits | Yes | For ImageNetLoc-FS (Eli et al. 2019), we divide the total 331 classes into three sets: 101 base classes for base-training, 214 novel classes for fine-tuning and evaluation, and 16 classes for validation. For CUB (Wah et al. 2011), we split the 200 classes into three sets: 100 base classes for base-training, 50 novel classes for fine-tuning and evaluation, and 50 classes for validation. |
| Hardware Specification | No | The paper mentions the use of 'Deformable DETR... with swin transformer-s as backbone' but does not specify the hardware (e.g., GPU model, CPU, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions using 'Deformable DETR' and 'swin transformer-s' but does not provide specific version numbers for these or other software dependencies. |
| Experiment Setup | Yes | $L_P = \lambda_P L_{ALN} + (1 - \lambda_P) L_{dis}$ (10), where $\lambda_P$ is the hyperparameter used to control the learning target. During the first half of pretraining, we set $\lambda_P$ to 1 to train the ALN alone. During the second half of pretraining, we set $\lambda_P$ to 0.5 to jointly train ALN and DETR... $L_R = L_{mil} + \lambda_1 \sum_{k=1}^{K} L_{ref}^{k} + \lambda_2 L_{box}$ (11), where $\lambda_1, \lambda_2$ are the hyperparameters used to balance the loss function, and we set $\lambda_1 = 1$, $\lambda_2 = 10$. Other hyperparameters of the refinement structure are set following OICR (Tang et al. 2017) (e.g., $K = 3$). |
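
The loss weighting quoted in the Experiment Setup row can be illustrated with a minimal sketch. This is not the authors' code (the table notes that none is released). The function names and the use of plain floats for the individual loss terms are assumptions made for illustration, and the exact step at which the second half of pretraining begins is also an assumption.

```python
from typing import Sequence


def pretrain_lambda(step: int, total_steps: int) -> float:
    """Hypothetical schedule for lambda_P: 1.0 during the first half of
    pretraining (ALN alone), 0.5 during the second half (ALN and DETR jointly).
    The handling of the exact midpoint is an assumption."""
    return 1.0 if step < total_steps / 2 else 0.5


def pretrain_loss(l_aln: float, l_dis: float, lambda_p: float) -> float:
    """Eq. (10): L_P = lambda_P * L_ALN + (1 - lambda_P) * L_dis."""
    return lambda_p * l_aln + (1.0 - lambda_p) * l_dis


def refine_loss(
    l_mil: float,
    l_ref: Sequence[float],  # one refinement loss per stage, K = 3 in the paper
    l_box: float,
    lambda_1: float = 1.0,   # value reported in the paper
    lambda_2: float = 10.0,  # value reported in the paper
) -> float:
    """Eq. (11): L_R = L_mil + lambda_1 * sum_k L_ref^k + lambda_2 * L_box."""
    return l_mil + lambda_1 * sum(l_ref) + lambda_2 * l_box


if __name__ == "__main__":
    # Placeholder loss values, chosen only to exercise the functions.
    for step in (0, 99):
        lam = pretrain_lambda(step, total_steps=100)
        print(step, lam, pretrain_loss(l_aln=2.0, l_dis=4.0, lambda_p=lam))
    print(refine_loss(l_mil=1.0, l_ref=[0.5, 0.5, 0.5], l_box=0.25))
```

With $\lambda_P = 1$ the $L_{dis}$ term drops out entirely, which matches the quoted statement that the ALN is trained alone during the first half of pretraining.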