Domain Adaptation for Large-Vocabulary Object Detectors
Authors: Kai Jiang, Jiaxing Huang, Weiying Xie, Jie Lei, Yunsong Li, Ling Shao, Shijian Lu
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on multiple widely adopted detection benchmarks show that KGD consistently outperforms the state of the art by large margins. This section presents experimental results; dataset and implementation details can be found in the Appendix. Section 4.1 presents the experiments across various downstream domain datasets. Section 4.2 and Section 4.3 provide ablation studies and discuss different features of KGD. |
| Researcher Affiliation | Collaboration | Kai Jiang1, Jiaxing Huang2, Weiying Xie1, Jie Lei3, Yunsong Li1, Ling Shao4, Shijian Lu2; 1State Key Laboratory of Integrated Services Networks, Xidian University, Xi'an 710071, China; 2S-Lab, School of Computer Science and Engineering, Nanyang Technological University; 3School of Electrical and Data Engineering, University of Technology Sydney; 4UCAS-Terminus AI Lab, University of Chinese Academy of Sciences, China |
| Pseudocode | Yes | A.3 Algorithm of KGD We describe the detailed algorithm of our proposed KGD in Algorithm 1. |
| Open Source Code | No | The code will be released upon acceptance. |
| Open Datasets | Yes | We perform experiments on 11 object detection datasets that span different downstream domains, including object detection for autonomous driving [73, 74], autonomous driving under different weather and time-of-day conditions [75], intelligent surveillance [80, 81, 82], common objects [78, 79], and artistic illustration [83]. |
| Dataset Splits | Yes | Cityscapes [73] is a dataset designed for street-scene understanding. It comprises images captured in 50 different cities, with 2975 training images and 500 validation images. |
| Hardware Specification | Yes | The experiments are conducted on one RTX 2080Ti. |
| Software Dependencies | No | The paper mentions the 'AdamW [90] optimizer' and implies a deep learning framework through model architectures such as 'CenterNet2' and 'Swin-B', but does not provide specific version numbers for software dependencies such as PyTorch, CUDA, or other libraries. |
| Experiment Setup | Yes | We use the AdamW [90] optimizer with initial learning rate 5×10⁻⁶ and weight decay 10⁻⁴, and adopt a cosine learning rate schedule without warm-up iterations. The batch size is 2 and the image's shorter side is set to 640 while keeping the aspect ratio unchanged. Pseudo labels generated by the teacher detector with confidence greater than the threshold τ = 0.25 are selected for adaptation. |
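The quoted setup maps directly onto standard training code. Below is a minimal, hypothetical sketch of the optimizer, cosine schedule, and pseudo-label filtering described above, assuming a PyTorch detector whose teacher outputs are tensors; the helper names `build_optimizer_and_schedule` and `select_pseudo_labels` are illustrative and do not come from the paper (the authors' code is not yet released).

```python
# Minimal sketch of the adaptation setup quoted above (assumption: PyTorch).
# Helper names are illustrative, not from the paper.
import torch
from torch.optim import AdamW
from torch.optim.lr_scheduler import CosineAnnealingLR


def build_optimizer_and_schedule(model, total_iters):
    # AdamW with initial lr 5e-6 and weight decay 1e-4; cosine decay, no warm-up.
    optimizer = AdamW(model.parameters(), lr=5e-6, weight_decay=1e-4)
    scheduler = CosineAnnealingLR(optimizer, T_max=total_iters)
    return optimizer, scheduler


def select_pseudo_labels(boxes, scores, labels, tau=0.25):
    # Keep only teacher predictions whose confidence exceeds the threshold tau = 0.25.
    keep = scores > tau
    return boxes[keep], scores[keep], labels[keep]
```

In such a setup, `scheduler.step()` would be called once per iteration (with `total_iters` set to the adaptation budget), and the filtered teacher predictions would serve as pseudo ground truth for the student detector's loss.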