Supervision Interpolation via LossMix: Generalizing Mixup for Object Detection and Beyond
Authors: Thanh Vu, Baochen Sun, Bodi Yuan, Alex Ngai, Yueqi Li, Jan-Michael Frahm
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical results on the PASCAL VOC and MS COCO datasets demonstrate that LossMix can consistently outperform state-of-the-art methods widely adopted for detection. |
| Researcher Affiliation | Collaboration | 1University of North Carolina at Chapel Hill 2Mineral {thanhvu,baochens,bodiyuan}@mineral.ai, {alexander.s.ngai,yueqili.innovation}@gmail.com, jmf@cs.unc.edu |
| Pseudocode | No | No structured pseudocode or algorithm blocks were found in the paper. |
| Open Source Code | No | The paper mentions leveraging the 'open-source PyTorch-based Detectron2 (Wu et al. 2019) repository as our object detection codebase' and 'leverage the open source code of the state-of-the-art Adaptive Teacher (Li et al. 2022b) framework', but does not provide a link or explicit statement about the availability of *their own* LossMix implementation code. |
| Open Datasets | Yes | We conduct experiments on two standard benchmark datasets in object detection, namely PASCAL VOC (Everingham et al. 2010) and MS COCO (Lin et al. 2014). We follow (Zhang et al. 2019) and use the combination of PASCAL VOC 2007 trainval (5k images) and 2012 trainval (12k images) for training. MS COCO (Lin et al. 2014) is composed of 80 object categories and is 10 times larger than PASCAL VOC. We conduct our experiments for cross-domain object detection using two popular and challenging real-to-artistic adaptation setups (Chen et al. 2020; Deng et al. 2021; Kim et al. 2019; Li et al. 2022b; Saito et al. 2019; Shen et al. 2019; Xu et al. 2020): PASCAL VOC (Everingham et al. 2010) → Clipart1k (Inoue et al. 2018) and PASCAL VOC (Everingham et al. 2010) → Watercolor2k (Inoue et al. 2018). |
| Dataset Splits | Yes | We follow (Zhang et al. 2019) and use the combination of PASCAL VOC 2007 trainval (5k images) and 2012 trainval (12k images) for training. Together they make up 16,551 images of 20 categories of common, real-world objects, each with fully annotated bounding boxes and class labels. The evaluation is done on the PASCAL VOC 2007 test set (5k images). MS COCO (Lin et al. 2014) is composed of 80 object categories and is 10 times larger than PASCAL VOC. We train on train2017 (118K images) and evaluate on val2017 (5K images). The Clipart1k dataset shares the same set of object categories as PASCAL VOC and contains a total of 1000 images, which we split into 500 training and 500 test examples. The Watercolor2k dataset, which has 2000 images from 6 classes in common with PASCAL VOC, is split into 1000 training and 1000 test images. |
| Hardware Specification | Yes | All experiments were trained with 8 NVIDIA GPUs, either V100 or A100. |
| Software Dependencies | No | The paper mentions using 'PyTorch-based Detectron2' but does not specify version numbers for these software components or any other libraries. |
| Experiment Setup | Yes | Unless otherwise specified, we use a batch size of 64 for faster convergence, an initial learning rate of 0.08, and the default step scheduler from Detectron2. We train PASCAL VOC for 18K iterations, which is about 70 epochs, and MS COCO for 270K iterations, or roughly 146.4 epochs. |
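For context on the paper's core idea (interpolating supervision at the loss level rather than the label level, generalizing mixup), the following is a minimal classification-flavored sketch. The function names and toy numbers are illustrative assumptions, not the paper's detection-head formulation or released code:

```python
import math

def cross_entropy(probs, label):
    """Negative log-likelihood of the correct class (toy helper)."""
    return -math.log(probs[label])

def loss_level_mix(probs_mixed, label_a, label_b, lam):
    """Sketch of loss-level supervision interpolation: score the
    prediction on the mixed input against each source label, then
    mix the two losses with the same coefficient lam used to mix
    the inputs. Hypothetical name; not the paper's exact LossMix
    formulation for object detection."""
    return (lam * cross_entropy(probs_mixed, label_a)
            + (1 - lam) * cross_entropy(probs_mixed, label_b))

# Toy example: softmax output on an image mixed from a class-0 image
# (weight lam) and a class-2 image (weight 1 - lam).
probs = [0.6, 0.1, 0.3]
loss = loss_level_mix(probs, label_a=0, label_b=2, lam=0.7)
```

Mixing the losses instead of the labels is what lets this style of interpolation extend beyond classification, since detection targets (boxes plus classes) cannot be linearly blended the way one-hot labels can.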