Learning Domain-Aware Detection Head with Prompt Tuning

Authors: Haochen Li, Rui Zhang, Hantao Yao, Xinkai Song, Yifan Hao, Yongwei Zhao, Ling Li, Yunji Chen

NeurIPS 2023

Reproducibility Variable Result LLM Response
Research Type Experimental Comprehensive experiments over multiple cross-domain adaptation tasks demonstrate that using the domain-adaptive prompt can produce an effectively domain-related detection head for boosting domain-adaptive object detection. Our code is available at https://github.com/Therock90421/DA-Pro. ... We conduct extensive experiments for the proposed DA-Pro on three mainstream benchmarks: Cross-Weather (Cityscapes → Foggy Cityscapes), Cross-Fov (KITTI → Cityscapes), and Sim-to-Real (SIM10K → Cityscapes). The experimental results show that our method brings noticeable improvement and achieves state-of-the-art performance. Concretely, DA-Pro improves the mAP by 1.9%~3.3% on synthetic and real datasets over the strong baseline RegionCLIP. In the best case, we achieve 55.9% mAP on the widely accepted benchmark of Cross-Weather, showing remarkable effectiveness in applying the domain-adaptive detection head.
Researcher Affiliation Academia 1 Intelligent Software Research Center, Institute of Software, CAS, Beijing, China; 2 State Key Lab of Processors, Institute of Computing Technology, CAS, Beijing, China; 3 State Key Laboratory of Multimodal Artificial Intelligence Systems, Institute of Automation, CAS, Beijing, China; 4 University of Chinese Academy of Sciences, Beijing, China
Pseudocode No The paper does not contain any clearly labeled pseudocode or algorithm blocks.
Open Source Code Yes Our code is available at https://github.com/Therock90421/DA-Pro.
Open Datasets Yes Cross-Weather: Cityscapes [4] is a large-scale dataset... Foggy Cityscapes [32] is a synthetic foggy dataset... Cross-Fov: KITTI [10] is a vital dataset... Sim-to-Real: SIM10K [18] is a synthetic dataset... Pascal VOC [7] is a large-scale real-world dataset... Clipart [15] is collected from the website... Watercolor and Comic [15] both contain 1,000 training images and 1,000 test images in art style, sharing 6 categories with Pascal VOC.
Dataset Splits Yes Cityscapes [4] is a large-scale dataset that contains diverse images recorded in street scenes. It is divided into 2,975 training and 500 validation images, annotated with 8 classes. Foggy Cityscapes [32] is a synthetic foggy dataset... containing 8,925 training images and 1,500 validation images. We take the training set of Cityscapes as the source domain and the training set of Foggy Cityscapes as the target domain, evaluating Cross-Weather adaptation performance on the 1,500-image validation set in all 8 categories.
Hardware Specification Yes All experiments are deployed on a Tesla V100 GPU.
Software Dependencies No The paper mentions using specific models like ResNet-50 and Transformer and initializing with CLIP, but it does not provide specific software dependencies with version numbers (e.g., Python version, library versions like PyTorch, TensorFlow, or scikit-learn).
Experiment Setup Yes We fix the length of learnable tokens M, N to 8, 8, respectively. The hyperparameter λ is set to 1.0. We set the batch size of each domain to 2 and use the SGD optimizer with a warm-up learning rate for training.
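
The Experiment Setup row can be collected into a small configuration object. The following is a minimal sketch, not the authors' code: the class name `DAProSetup`, the helper `warmup_lr`, and the values of `base_lr`, `warmup_iters`, and `warmup_factor` are hypothetical placeholders, since the row does not report a base learning rate or a warm-up length. The linear warm-up shape is likewise an assumption; the row only says a warm-up learning rate is used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DAProSetup:
    # Values reported in the Experiment Setup row.
    num_tokens_m: int = 8             # length of learnable tokens M
    num_tokens_n: int = 8             # length of learnable tokens N
    loss_weight_lambda: float = 1.0   # hyperparameter lambda
    batch_size_per_domain: int = 2
    optimizer: str = "SGD"
    # Hypothetical placeholders: NOT reported in the row above.
    base_lr: float = 0.002
    warmup_iters: int = 500
    warmup_factor: float = 0.001


def warmup_lr(step: int, cfg: DAProSetup) -> float:
    """Linear warm-up from base_lr * warmup_factor up to base_lr (assumed shape)."""
    if step >= cfg.warmup_iters:
        return cfg.base_lr
    alpha = step / cfg.warmup_iters
    factor = cfg.warmup_factor * (1.0 - alpha) + alpha
    return cfg.base_lr * factor


cfg = DAProSetup()
# Assumption: each step draws one source batch and one target batch.
images_per_step = 2 * cfg.batch_size_per_domain
```

Under the assumption of one source and one target batch per step, `images_per_step` is 4, which fits the single Tesla V100 reported in the Hardware Specification row. Any value marked as a placeholder should be replaced with the setting from the released repository before attempting a reproduction.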