DetKDS: Knowledge Distillation Search for Object Detectors
Authors: Lujun Li, Yufan Bao, Peijie Dong, Chuanguang Yang, Anggeng Li, Wenhan Luo, Qifeng Liu, Wei Xue, Yike Guo
ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Comprehensive experiments on different detectors demonstrate that DetKDS outperforms state-of-the-art methods in detection and instance segmentation tasks. Extensive experiments are conducted to elaborate on the effectiveness of DetKDS. |
| Researcher Affiliation | Academia | The Hong Kong University of Science and Technology; The Hong Kong University of Science and Technology (Guangzhou); Institute of Computing Technology, Chinese Academy of Sciences. |
| Pseudocode | Yes | Algorithm 1: Divide-and-conquer Evolution in DetKDS (a generic sketch of this evolutionary loop is given after the table). |
| Open Source Code | Yes | Code at: https://github.com/lliai/DetKDS. |
| Open Datasets | Yes | We evaluate our method on the COCO dataset (Lin et al., 2014), which contains 80 object classes. After searching, we train student detectors with the best distiller on the full COCO dataset, including 120K training images. |
| Dataset Splits | Yes | For search settings, we conduct all searches on a subset of the COCO training set (i.e., mini-COCO), which consists of 25K training images and 5K validation images. |
| Hardware Specification | Yes | Following the same training settings as FGD, we develop our experiments using 8 NVIDIA V100 GPUs with a mini-batch of two images per GPU. |
| Software Dependencies | No | The paper mentions using an 'SGD optimizer' but does not specify any software libraries, frameworks, or their version numbers that would be necessary for replication. |
| Experiment Setup | Yes | We configure 20 iterations of parallel searching for individual losses and 40 iterations for the combined weights of multiple losses, with one training epoch per search iteration. For EA settings, we set (P, T, r, k) in Alg. 1 to (20, 40, 0.9, 5). After searching, we train student detectors with the best distiller on the full COCO dataset, including 120K training images. Following the same training settings as FGD, we develop our experiments using 8 NVIDIA V100 GPUs with a mini-batch of two images per GPU. We train all the detectors for 24 epochs with an SGD optimizer, in which the momentum is 0.9 and the weight decay is 0.0001 (hedged sketches of the search loop and optimizer configuration follow the table). |
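The divide-and-conquer evolution settings (P, T, r, k) = (20, 40, 0.9, 5) reported above suggest a standard evolutionary search over distiller candidates. The sketch below is a generic interpretation, not the paper's Algorithm 1: it assumes P is the population size, T the number of evolution iterations, r the probability of mutating a top parent versus sampling a fresh candidate, and k the size of the parent pool; `sample_distiller`, `mutate`, and `evaluate` are hypothetical callables.

```python
import random
from typing import Callable, Dict, List

def evolutionary_search(
    sample_distiller: Callable[[], Dict],
    mutate: Callable[[Dict], Dict],
    evaluate: Callable[[Dict], float],
    P: int = 20,     # population size (paper's EA setting)
    T: int = 40,     # number of evolution iterations (paper's EA setting)
    r: float = 0.9,  # assumed: probability of mutating a parent vs. resampling
    k: int = 5,      # assumed: number of top candidates kept as parents
) -> Dict:
    """Generic evolutionary-search loop; an interpretation of the (P, T, r, k)
    settings quoted above, not the authors' Algorithm 1."""
    population: List[Dict] = [sample_distiller() for _ in range(P)]
    scores: List[float] = [evaluate(c) for c in population]
    for _ in range(T):
        # Select the k best candidates seen so far as the parent pool.
        ranked = sorted(zip(scores, population), key=lambda x: x[0], reverse=True)
        parents = [cand for _, cand in ranked[:k]]
        # With probability r, mutate a random parent; otherwise sample a fresh candidate.
        child = mutate(random.choice(parents)) if random.random() < r else sample_distiller()
        population.append(child)
        scores.append(evaluate(child))
    # Return the best-scoring distiller configuration found during the search.
    return max(zip(scores, population), key=lambda x: x[0])[1]
```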
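The training recipe quoted in the experiment-setup row (24 epochs, SGD with momentum 0.9 and weight decay 0.0001, 8 GPUs with two images each) maps onto a short PyTorch configuration. This is a minimal sketch under those stated values; the learning rate and the `build_optimizer` helper are assumptions, not the authors' released settings.

```python
import torch

# Values quoted in the experiment-setup row: 24 epochs, SGD (momentum 0.9,
# weight decay 1e-4), 8 GPUs with a mini-batch of two images per GPU.
NUM_EPOCHS = 24
IMAGES_PER_GPU = 2
NUM_GPUS = 8
TOTAL_BATCH_SIZE = IMAGES_PER_GPU * NUM_GPUS  # 16 images per step

def build_optimizer(model: torch.nn.Module) -> torch.optim.SGD:
    """SGD optimizer matching the reported hyperparameters; the learning
    rate is an assumed placeholder (not stated in the section above)."""
    return torch.optim.SGD(
        model.parameters(),
        lr=0.02,            # assumption: typical detection LR, not given here
        momentum=0.9,       # from the paper
        weight_decay=1e-4,  # from the paper
    )
```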