Cloud Object Detector Adaptation by Integrating Different Source Knowledge

Authors: Shuaifeng Li, Mao Ye, Lihua Zhou, Nianxin Li, Siying Xiao, Song Tang, Xiatian Zhu

Venue: NeurIPS 2024

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Experiment results demonstrate that the proposed COIN method achieves state-of-the-art performance. |
| Researcher Affiliation | Academia | 1. University of Electronic Science and Technology of China; 2. University of Shanghai for Science and Technology; 3. University of Surrey |
| Pseudocode | Yes | Algorithm 1: Our proposed COIN method. |
| Open Source Code | Yes | https://github.com/Flashkong/COIN |
| Open Datasets | Yes | Specifically, we validate the effectiveness of the proposed COIN method on six object detection datasets, e.g., Cityscapes [11], Foggy-Cityscapes [11], Clipart [25], BDD100K [63], KITTI [16] and Sim10K [26]. |
| Dataset Splits | No | Cityscapes [11] consists of 2,975 training images and 500 testing images... Foggy-Cityscapes [11] contains three levels of foggy images simulated from the images of Cityscapes: 2,975 training images and 500 testing images... For comparison with existing methods, we follow [35, 14], and use 36,728 training images and 5,258 testing images with 7 classes for training and testing respectively. KITTI [16] contains 7,481 urban images with the car category. We use all the images for training and testing. Sim10K [26] contains 10K images collected from the computer game Grand Theft Auto V with the car category. All images are used for training and testing. |
| Hardware Specification | Yes | One 3090 GPU, a batch size of 3 and a random seed of 2024 are used for all experiments. |
| Software Dependencies | No | One 3090 GPU, a batch size of 3 and a random seed of 2024 are used for all experiments. SGD [2] is used as the optimizer, where the initial learning rate is 0.001 and the weight decay is 0.0001. |
| Experiment Setup | Yes | The hyperparameters γ1, γ2 and π are set to 0.1, 0.1 and 0.7 by default. The shorter side of the image is resized to 600 during training and testing, and the reported mean average precision (mAP) is based on an IoU threshold of 0.5. For pre-training the CLIP detector, we iterate 50K steps. For knowledge distillation, we generally iterate 45K steps using Eq. 17, and then iterate 20K steps using Eq. 18. |
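
For readers attempting reproduction, the settings reported in the Hardware, Software Dependencies and Experiment Setup rows can be collected into a single configuration sketch. The snippet below is a minimal illustration assuming a PyTorch-style pipeline; the `CFG` dictionary, `set_seed` helper and placeholder model are our own naming, not the COIN repository's actual API.

```python
import random
import numpy as np
import torch
from torch.optim import SGD

# Values taken verbatim from the paper's reported setup; all names here are
# illustrative placeholders, not identifiers from the COIN codebase.
CFG = {
    "seed": 2024,                  # random seed used for all experiments
    "batch_size": 3,               # batch size on a single RTX 3090
    "lr": 1e-3,                    # initial SGD learning rate
    "weight_decay": 1e-4,          # SGD weight decay
    "gamma1": 0.1,                 # hyperparameter γ1
    "gamma2": 0.1,                 # hyperparameter γ2
    "pi": 0.7,                     # hyperparameter π
    "short_side": 600,             # shorter image side during train/test
    "pretrain_steps": 50_000,      # CLIP-detector pre-training iterations
    "distill_steps_eq17": 45_000,  # distillation iterations using Eq. 17
    "distill_steps_eq18": 20_000,  # follow-up iterations using Eq. 18
}

def set_seed(seed: int) -> None:
    """Fix RNG state for repeatable runs (the paper fixes seed 2024)."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)

set_seed(CFG["seed"])

# Hypothetical stand-in model; any torch.nn.Module would slot in here.
model = torch.nn.Conv2d(3, 8, kernel_size=3)
optimizer = SGD(model.parameters(), lr=CFG["lr"],
                weight_decay=CFG["weight_decay"])
```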
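The Experiment Setup row states that mAP is computed at an IoU threshold of 0.5. The sketch below shows what that threshold means for a single predicted box against a ground-truth box; it is a generic IoU computation in (x1, y1, x2, y2) format, not code from the paper.

```python
def iou_xyxy(a, b):
    """Intersection-over-union for two boxes in (x1, y1, x2, y2) format."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# Under mAP@0.5, a detection can only count as a true positive
# if its IoU with a ground-truth box reaches the 0.5 threshold.
pred, gt = (10, 10, 60, 60), (20, 20, 70, 70)
print(iou_xyxy(pred, gt) >= 0.5)  # False here: IoU ≈ 0.47
```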