Long-tailed Object Detection Pretraining: Dynamic Rebalancing Contrastive Learning with Dual Reconstruction
Authors: Chen-Long Duan, Yong Li, Xiu-Shen Wei, Lin Zhao
NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on COCO and LVIS v1.0 datasets demonstrate the effectiveness of our method, particularly in improving the mAP/AP scores for tail classes. To evaluate the effectiveness of our method, we conduct extensive experiments on two benchmark datasets, i.e., COCO [35] and LVIS v1.0 [13]. Experiments on these datasets from both quantitative and qualitative perspectives validate the effectiveness of our proposed method. |
| Researcher Affiliation | Academia | (1) Nanjing University of Science and Technology; (2) School of Computer Science and Engineering, and Key Laboratory of New Generation Artificial Intelligence Technology and Its Interdisciplinary Applications, Southeast University |
| Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The authors state: 'We will continue to conduct further research based on this work. However, we can consider releasing the main checkpoints for public use.' This indicates a future possibility rather than immediate open-source availability. |
| Open Datasets | Yes | We conduct experiments on two representative datasets: COCO [35] and LVIS v1.0 [13]. Microsoft COCO: Common objects in context. LVIS: A dataset for large vocabulary instance segmentation. |
| Dataset Splits | Yes | The COCO dataset...comprising 80 classes with a relatively balanced distribution, including 118k training images and 5k validation images. LVIS features 1,203 classes with a highly imbalanced distribution, containing 100k training images and 19.8k validation images. |
| Hardware Specification | Yes | We pre-train the models on 8 RTX3090 GPUs with a batch size of 16. The models are trained with a total batch size of 16 on 8 GPUs (RTX3090 with 24 GB VRAM). |
| Software Dependencies | No | All models are implemented using the MMDetection toolbox [5]. We employ MMDetection [5] as our detection framework to conduct our experiment. PyTorch: An imperative style, high-performance deep learning library. The paper mentions software tools like MMDetection and PyTorch but does not specify their version numbers, which are crucial for reproducibility. |
| Experiment Setup | Yes | Unless otherwise specified, pre-training follows the 1× schedule (12 epochs), starting with an initial learning rate of 0.02, which is reduced by a factor of 10 after the 8th and 11th epochs. For the 2× schedule, models are trained for 24 epochs, and the learning rate decays at the end of epochs 16 and 22. In our experiments, the hyper-parameters are set as follows: α_c is set to 0.1, β_c is set to 0.05, and α_r is set to 0.1. We trained the models using SGD with momentum 0.9. The batch size and learning rate are set as 16 and 0.02, and the data augmentation strictly follows previous long-tailed detection methods [24, 47]. |
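
The optimizer and schedule details quoted in the Experiment Setup row can be summarized in a minimal sketch, assuming a plain PyTorch training loop rather than the authors' unreleased MMDetection configuration. The model, dataloader, and criterion below are hypothetical placeholders; only the numeric values (SGD with momentum 0.9, learning rate 0.02, total batch size 16, decay by a factor of 10 after epochs 8 and 11 of the 12-epoch 1× schedule) come from the paper.

```python
# Minimal sketch of the reported 1x pre-training schedule, assuming a plain
# PyTorch loop; the model, dataloader, and criterion arguments are hypothetical
# placeholders, not part of the paper or of MMDetection.
import torch
from torch.optim import SGD
from torch.optim.lr_scheduler import MultiStepLR


def pretrain(model, dataloader, criterion, epochs=12):
    # SGD with momentum 0.9 and an initial learning rate of 0.02 (paper values).
    optimizer = SGD(model.parameters(), lr=0.02, momentum=0.9)
    # Reduce the learning rate by a factor of 10 after the 8th and 11th epochs
    # (1x schedule); the 2x schedule would run 24 epochs with milestones [16, 22].
    scheduler = MultiStepLR(optimizer, milestones=[8, 11], gamma=0.1)

    for epoch in range(epochs):
        for images, targets in dataloader:  # total batch size 16 across 8 GPUs
            optimizer.zero_grad()
            loss = criterion(model(images), targets)
            loss.backward()
            optimizer.step()
        scheduler.step()
```

In MMDetection the same values would normally live in the config file's optimizer and learning-rate schedule entries, but since the paper does not release its configs, the sketch above is only one plausible translation of the reported numbers into code.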