Learning Task-Aware Language-Image Representation for Class-Incremental Object Detection
Authors: Hongquan Zhang, Bin-Bin Gao, Yi Zeng, Xudong Tian, Xin Tan, Zhizhong Zhang, Yanyun Qu, Jun Liu, Yuan Xie
AAAI 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct extensive experiments on COCO 2017 and Pascal VOC 2007 and demonstrate that the proposed method achieves state-of-the-art results under the various CIOD settings. |
| Researcher Affiliation | Collaboration | 1 East China Normal University, 2 Tencent YouTu Lab, 3 Chongqing Institute of East China Normal University, 4 Xiamen University |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any explicit statement about releasing source code or a link to a code repository for the described methodology. |
| Open Datasets | Yes | We conduct extensive experiments on COCO 2017 and Pascal VOC 2007 and demonstrate that the proposed method achieves state-of-the-art results under the various CIOD settings. The proposed method is evaluated on two benchmark datasets, i.e., Pascal VOC 2007 and Microsoft COCO 2017. |
| Dataset Splits | Yes | VOC 2007 has 20 object classes, and we use the trainval subset for training and the test subset for evaluation; the mean average precision (mAP) at 0.5 IoU threshold is used to measure the performance. We ensure consistency between data partitioning methods and CIOD (Dong et al. 2023) for VOC 2007. COCO 2017 has 80K images in the training set and 40K images in the validation set for 80 object classes; we use the train set for training and the minival set for testing, and the standard COCO protocols are used as the evaluation metrics, i.e., AP, AP50, AP75, APS, APM, and APL. We ensure consistency between data partitioning methods and ERD (Feng, Wang, and Yuan 2022) for COCO 2017. |
| Hardware Specification | Yes | All the experiments are performed on 8 NVIDIA Tesla V100 GPUs with a batch size of 16; we use AdamW as the optimizer with a learning rate of 5 × 10⁻⁶ for the language backbone and 5 × 10⁻⁵ for the other parts. |
| Software Dependencies | No | The paper mentions using GLIP, Swin-Tiny, FPN, BERT, and ADAMW, but does not provide specific version numbers for these software components or other libraries (e.g., Python, PyTorch versions) required for reproduction. |
| Experiment Setup | Yes | All the experiments are performed on 8 NVIDIA Tesla V100 GPUs with a batch size of 16; we use AdamW as the optimizer with a learning rate of 5 × 10⁻⁶ for the language backbone and 5 × 10⁻⁵ for the other parts. |
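The experiment setup quoted above specifies AdamW with two learning rates: 5 × 10⁻⁶ for the language backbone and 5 × 10⁻⁵ for the remaining parts of the model. The following is a minimal PyTorch-style sketch of how such a two-group optimizer could be configured; the toy module and the `language_backbone` attribute name are illustrative assumptions, not the authors' released code (the paper provides none).

```python
import torch
from torch import nn

# Hypothetical stand-in for the GLIP-style detector described in the paper:
# a language backbone (BERT in the paper), a visual backbone (Swin-Tiny + FPN),
# and a detection head. Module names here are assumptions for illustration only.
class ToyGroundingDetector(nn.Module):
    def __init__(self):
        super().__init__()
        self.language_backbone = nn.Linear(768, 256)
        self.visual_backbone = nn.Linear(1024, 256)
        self.head = nn.Linear(256, 80)

model = ToyGroundingDetector()

# Split parameters into the language backbone vs. everything else, then
# assign the two learning rates quoted in the experiment setup (5e-6 / 5e-5).
language_params = list(model.language_backbone.parameters())
language_param_ids = {id(p) for p in language_params}
other_params = [p for p in model.parameters() if id(p) not in language_param_ids]

optimizer = torch.optim.AdamW(
    [
        {"params": language_params, "lr": 5e-6},
        {"params": other_params, "lr": 5e-5},
    ]
)
```

Using per-parameter-group learning rates this way is the standard PyTorch mechanism for training a pretrained language encoder more conservatively than the rest of the detector; the batch size of 16 reported above would be handled by the data loader, not the optimizer.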