AQT: Adversarial Query Transformers for Domain Adaptive Object Detection

Authors: Wei-Jie Huang, Yu-Lin Lu, Shih-Yao Lin, Yusheng Xie, Yen-Yu Lin

IJCAI 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Thorough experiments over several domain adaptive object detection benchmarks demonstrate that our approach performs favorably against the state-of-the-art methods.
Researcher Affiliation | Collaboration | Wei-Jie Huang¹, Yu-Lin Lu¹, Shih-Yao Lin², Yusheng Xie³ and Yen-Yu Lin¹,⁴; ¹National Yang Ming Chiao Tung University; ²Sony Corporation of America; ³Amazon; ⁴Academia Sinica
Pseudocode | No | The paper describes the proposed method using descriptive text and mathematical equations, but it does not include a formal pseudocode block or algorithm.
Open Source Code | Yes | Source code is available at https://github.com/weii41392/AQT.
Open Datasets | Yes | Cityscapes [Cordts et al., 2016] is an urban scene dataset containing 2,975 training images and 500 validation images. Foggy Cityscapes [Sakaridis et al., 2018] is synthesized from, and shares annotations with, Cityscapes. BDD100k [Yu et al., 2020] is a large-scale driving dataset with diverse scenarios. Sim10k [Johnson-Roberson et al., 2017] is a synthetic driving dataset containing 10,000 images.
Dataset Splits | Yes | Cityscapes [Cordts et al., 2016] is an urban scene dataset containing 2,975 training images and 500 validation images.
Hardware Specification | No | The paper mentions that experiments were conducted but does not provide any specific hardware details such as GPU models, CPU types, or memory specifications.
Software Dependencies | No | The paper mentions using 'Deformable DETR [Zhu et al., 2021] as our object detector with a ResNet-50 backbone pre-trained on ImageNet [Deng et al., 2009]' but does not provide specific version numbers for software libraries or dependencies (e.g., Python, PyTorch, CUDA versions).
Experiment Setup | Yes | We inherit most hyperparameters and training settings from Zhu et al., including the detection loss L_det and Xavier initialization [Glorot and Bengio, 2010]. In Cityscapes to Foggy Cityscapes, λ_sp, λ_ch, and λ_ins are all set to 10^-1. In the other settings, following [Saito et al., 2019], we adopt local alignment on the backbone and weak alignment using the focal loss [Lin et al., 2017]; λ_sp, λ_ch, and λ_ins are set to 10^-1, 10^-5, and 10^-4, respectively. The batch size is set to 8 in all experiments.
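
The experiment-setup excerpt describes the training objective only in prose, so a short sketch may help readers attempting a reproduction. Below is a minimal PyTorch illustration of the weighted loss combination implied by the quoted λ values; the variable names (l_det, l_sp, l_ch, l_ins) and the dummy loss values are assumptions made for illustration only, not taken from the authors' released code.

```python
import torch

# Loss weights quoted in the Experiment Setup row for Cityscapes -> Foggy
# Cityscapes; the other benchmark settings reportedly use 1e-1, 1e-5, and
# 1e-4 instead. Names here are illustrative, not the authors' identifiers.
LAMBDA_SP = 1e-1   # spatial query alignment weight
LAMBDA_CH = 1e-1   # channel query alignment weight
LAMBDA_INS = 1e-1  # instance query alignment weight


def total_loss(l_det, l_sp, l_ch, l_ins):
    """Weighted sum of the detection loss and the three adversarial
    alignment losses, as described in the quoted setup."""
    return l_det + LAMBDA_SP * l_sp + LAMBDA_CH * l_ch + LAMBDA_INS * l_ins


# Dummy scalar losses standing in for a real forward pass:
l_det = torch.tensor(2.3, requires_grad=True)
l_sp = torch.tensor(0.7, requires_grad=True)
l_ch = torch.tensor(0.5, requires_grad=True)
l_ins = torch.tensor(0.9, requires_grad=True)

loss = total_loss(l_det, l_sp, l_ch, l_ins)
loss.backward()  # gradients flow into all four loss terms
```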