Weakly Supervised Few-Shot Object Detection with DETR

Authors: Chenbo Zhang, Yinglu Zhang, Lu Zhang, Jiajia Zhao, Jihong Guan, Shuigeng Zhou

AAAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments also show that the proposed method clearly outperforms the existing counterparts in the WS-FSOD task. We conduct extensive experiments on benchmark datasets, which show that the proposed method significantly outperforms the state-of-the-art methods, validating its effectiveness.
Researcher Affiliation | Academia | (1) Shanghai Key Lab of Intelligent Information Processing, and School of Computer Science, Fudan University, China; (2) Science and Technology on Complex System Control and Intelligent Agent Cooperation Laboratory, Beijing Electro-Mechanical Engineering Institute, China; (3) Department of Computer Science & Technology, Tongji University, China
Pseudocode | No | The paper describes the proposed method in detail with figures, but it does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | No | The paper mentions referring to 'our supplementary material for more details' but does not explicitly state that source code is provided or offer a direct link to a code repository.
Open Datasets | Yes | Following the only previous work StarNet (Karlinsky et al. 2021) in WS-FSOD, we take three benchmark datasets for evaluation: ImageNetLoc-FS, CUB-200 and PASCAL VOC. For ImageNetLoc-FS (Eli et al. 2019)... For CUB (Wah et al. 2011)... For PASCAL VOC (Everingham et al. 2010)...
Dataset Splits | Yes | For ImageNetLoc-FS (Eli et al. 2019), we divide the total 331 classes into three sets: 101 base classes for base-training, 214 novel classes for fine-tuning and evaluation, and 16 classes for validation. For CUB (Wah et al. 2011), we split the 200 classes into three sets: 100 base classes for base-training, 50 novel classes for fine-tuning and evaluation, and 50 classes for validation.
Hardware Specification | No | The paper mentions the use of 'Deformable DETR... with swin transformer-s as backbone' but does not specify the hardware (e.g., GPU model, CPU, memory) used for running the experiments.
Software Dependencies | No | The paper mentions using 'Deformable DETR' and 'swin transformer-s' but does not provide specific version numbers for these or other software dependencies.
Experiment Setup | Yes | $L_P = \lambda_P L_{ALN} + (1 - \lambda_P) L_{dis}$ (10), where $\lambda_P$ is the hyperparameter used to control the learning target. During the first half of pretraining, we set $\lambda_P$ to 1 to train the ALN alone. During the second half of pretraining, we set $\lambda_P$ to 0.5 to jointly train ALN and DETR... $L_R = L_{mil} + \lambda_1 \sum_{k=1}^{K} L_{ref}^{k} + \lambda_2 L_{box}$ (11), where $\lambda_1, \lambda_2$ are the hyperparameters used to balance the loss function, and we set $\lambda_1 = 1$, $\lambda_2 = 10$. Other hyperparameters of the refinement structure are set following OICR (Tang et al. 2017) (e.g., K=3).
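
The loss weighting quoted in the Experiment Setup row can be illustrated with a small sketch. This is not the authors' code: the function names are hypothetical, the individual loss terms are passed in as plain floats rather than computed, and treating "first half of pretraining" as "step index below half the total step count" is an assumption (the paper excerpt does not say whether the halves are measured in steps or epochs).

```python
from typing import Sequence


def pretrain_lambda(step: int, total_steps: int) -> float:
    """Return lambda_P for a pretraining step.

    Assumption: the "first half" is every step with index < total_steps / 2.
    First half -> 1.0 (ALN trained alone); second half -> 0.5 (joint training).
    """
    return 1.0 if step < total_steps / 2 else 0.5


def pretrain_loss(l_aln: float, l_dis: float, lam_p: float) -> float:
    """Eq. (10): L_P = lambda_P * L_ALN + (1 - lambda_P) * L_dis."""
    return lam_p * l_aln + (1.0 - lam_p) * l_dis


def refinement_loss(
    l_mil: float,
    l_ref: Sequence[float],  # one refinement loss per branch, K entries (K=3 quoted)
    l_box: float,
    lam1: float = 1.0,       # lambda_1 = 1 as quoted
    lam2: float = 10.0,      # lambda_2 = 10 as quoted
) -> float:
    """Eq. (11): L_R = L_mil + lambda_1 * sum_k L_ref^k + lambda_2 * L_box."""
    return l_mil + lam1 * sum(l_ref) + lam2 * l_box


if __name__ == "__main__":
    total = 100
    for step in (0, 49, 50, 99):
        lam = pretrain_lambda(step, total)
        print(step, lam, pretrain_loss(2.0, 4.0, lam))
    print(refinement_loss(1.0, [0.5, 0.5, 0.5], 0.1))
```

With lambda_P = 1 the distillation term drops out entirely, which matches the statement that the ALN is trained alone during the first half of pretraining.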