Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Boosting Few-Shot Open-Set Object Detection via Prompt Learning and Robust Decision Boundary
Authors: Zhaowei Wu, Binyi Su, Qichuan Geng, Hua Zhang, Zhong Zhou
IJCAI 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results demonstrate the superiority of our method on both known and unknown classes. Extensive experiments demonstrate the effectiveness of our FOOD method, which outperforms previous vision-only frameworks and achieves superior unknown rejection performance. |
| Researcher Affiliation | Academia | 1State Key Laboratory of Virtual Reality Technology and Systems, Beihang University, China 2The School of Artificial Intelligence and Data Science, Hebei University of Technology, China 3China Xiongan Group Digital City Technology Co., Ltd. 4Information Engineering College, Capital Normal University, China 5Institute of Information Engineering, Chinese Academy of Sciences 6Zhongguancun Laboratory, Beijing, China EMAIL, EMAIL, EMAIL, EMAIL |
| Pseudocode | No | The paper describes methods using prose and mathematical equations, but does not include a dedicated section for pseudocode or algorithm blocks. Figure 3 provides an architectural overview but is a diagram, not pseudocode. |
| Open Source Code | Yes | Our source code is available at https://gitee.com/VR NAVE/ced-food. |
| Open Datasets | Yes | Following [Su et al., 2024], the data splits VOC10-5-5, VOC-COCO, and COCO-Road Anomaly [Lis et al., 2019] are used for performance evaluation. VOC10-5-5 includes 10 base, 5 novel, and 5 unknown classes from PASCAL VOC [Everingham et al., 2010]. VOC-COCO has 20 base classes from PASCAL VOC, 20 novel classes from nonoverlapping MS COCO [Lin et al., 2014], and 40 unknown classes. |
| Dataset Splits | Yes | Following [Su et al., 2024], the data splits VOC10-5-5, VOC-COCO, and COCO-Road Anomaly [Lis et al., 2019] are used for performance evaluation. VOC10-5-5 includes 10 base, 5 novel, and 5 unknown classes from PASCAL VOC [Everingham et al., 2010]. We report the results of fine-tuning on 1, 3, 5, and 10 shots, averaging ten runs per setting for a fairer comparison. |
| Hardware Specification | Yes | We employ SGD with 0.9 momentum, 5e-5 weight decay, and a batch size of 1 on a GTX 1080 Ti GPU. |
| Software Dependencies | No | We employ RegionCLIP [Zhong et al., 2022] as the image encoder, and ResNet-50 [He et al., 2016] pre-trained on ImageNet as the RPN image encoder. Class-specific prompt training follows CoOp [Zhou et al., 2022b] with a context length of 16... The paper mentions software components and frameworks but does not specify their version numbers for reproducibility. |
| Experiment Setup | Yes | Class-specific prompt training follows CoOp [Zhou et al., 2022b] with a context length of 16, using a two-stage training strategy [Wang et al., 2020]... We employ SGD with 0.9 momentum, 5e-5 weight decay, and a batch size of 1 on a GTX 1080 Ti GPU. The learning rate is 0.0002 for base training and 0.0001 for fine-tuning. Visual alignment loss settings follow the setting in [Han et al., 2022]. Other hyperparameters τ, ε, λ, and β are 0.01, 0.1, 1e-4, and 1.0, respectively. |
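
For readers who want the reported settings in one place, the values quoted in the Experiment Setup and Dataset Splits rows can be collected into a small configuration object. This is only a sketch: the class name `TrainConfig`, its field names, and the helper `learning_rate` are hypothetical and do not come from the authors' repository. Only the numeric values are taken from the table above.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class TrainConfig:
    """Values quoted in the table above; all field names are hypothetical."""
    optimizer: str = "SGD"
    momentum: float = 0.9
    weight_decay: float = 5e-5
    batch_size: int = 1
    lr_base: float = 2e-4       # base training learning rate (0.0002)
    lr_finetune: float = 1e-4   # fine-tuning learning rate (0.0001)
    context_length: int = 16    # prompt context length
    tau: float = 0.01
    epsilon: float = 0.1
    lam: float = 1e-4           # lambda in the paper's notation
    beta: float = 1.0
    shots: Tuple[int, ...] = (1, 3, 5, 10)  # few-shot settings reported
    runs_per_setting: int = 10  # results are averaged over ten runs


def learning_rate(cfg: TrainConfig, stage: str) -> float:
    """Return the learning rate for a stage of the two-stage schedule."""
    if stage == "base":
        return cfg.lr_base
    if stage == "finetune":
        return cfg.lr_finetune
    raise ValueError(f"unknown stage: {stage!r}")


CONFIG = TrainConfig()
```

A training script could then read `learning_rate(CONFIG, "finetune")` when switching from base training to fine-tuning. How the authors' code actually organizes these settings is not stated in the table.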