Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

ReCon: Region-Controllable Data Augmentation with Rectification and Alignment for Object Detection

Authors: Haowei Zhu, Tianxiang Pan, Rui Qin, Jun-Hai Yong, Bin Wang

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments demonstrate that ReCon substantially improve the quality and trainability of generated data, achieving consistent performance gains across various datasets, backbone architectures, and data scales. Our code is available at https://github.com/haoweiz23/ReCon.
Researcher Affiliation | Collaboration | Haowei Zhu¹, Tianxiang Pan², Rui Qin¹, Jun-Hai Yong¹, Bin Wang¹,³; ¹Tsinghua University, ²Li Auto Inc., ³BNRist
Pseudocode | Yes | We present the pseudo algorithm of our Region-Guided Rectification Sampling in Algorithm 1. In Stage 1, we use a pre-trained perception model to identify instance masks. If annotated masks are already available, this step can be skipped.
Open Source Code | Yes | Extensive experiments demonstrate that ReCon substantially improve the quality and trainability of generated data, achieving consistent performance gains across various datasets, backbone architectures, and data scales. Our code is available at https://github.com/haoweiz23/ReCon.
Open Datasets | Yes | Extensive experiments demonstrate that ReCon substantially improve the quality and trainability of generated data, achieving consistent performance gains across various datasets, backbone architectures, and data scales. Our method achieves superior performance on the COCO dataset compared to models that have been specifically fine-tuned on COCO. To validate the generalization capability of our method, we conducted additional experiments on PASCAL VOC datasets.
Dataset Splits | Yes | To evaluate our approach under such conditions, we conduct experiments in three data-scarce regimes by randomly sampling 1%, 5%, and 10% of the COCO training set and then doubling each subset through augmentation. For VOC benchmark, we combine the training sets of VOC 2007 and VOC 2012 for model training, with evaluation performed on the VOC 2007 test set (4,952 images).
Hardware Specification | Yes | All runtime measurements were collected on a single NVIDIA RTX 3090 GPU.
Software Dependencies | Yes | We use Stable Diffusion v1.5 (Rombach et al., 2022) with a 25-step DDIM sampler (Song et al., 2020) and edge-conditioned ControlNet (Zhang et al., 2023) to generate training images. We implement training and evaluation code based on the MMDetection framework (Chen et al., 2019).
Experiment Setup | Yes | For consistency with prior work (Wang et al., 2024b), our default detector is Faster R-CNN (Ren et al., 2015) with an R-50-FPN backbone trained for six epochs. We implement training and evaluation code based on the MMDetection framework (Chen et al., 2019). For Faster R-CNN, ATSS, FCOS, and RetinaNet, we follow the standard 1× training schedule, running 12 epochs for all experiments, except for Faster R-CNN on the COCO dataset, where the training is reduced to 6 epochs. We use random flipping as the default data augmentation strategy. For YOLOX-S, we follow the official training setup with 300 epochs and apply a stronger augmentation pipeline, including mosaic, random affine transformations, mixup, random flipping, and HSV-based random augmentation. For DEIM, we follow the official training configuration to train model for 40 epochs.
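
The data-scarce regimes quoted under Dataset Splits (random 1%, 5%, and 10% subsets of the COCO training set, each doubled through augmentation) can be illustrated with a short sketch. This is not the authors' code: the function names, the fixed seed, and the stand-in list of image ids are hypothetical, and "doubling" is assumed here to mean one generated image per sampled real image.

```python
import random


def sample_subset(image_ids, fraction, seed=0):
    """Return a reproducible random subset holding `fraction` of the ids."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    rng = random.Random(seed)  # local RNG so the global random state is untouched
    k = max(1, round(len(image_ids) * fraction))
    return sorted(rng.sample(list(image_ids), k))


def augmented_size(subset_size, factor=2):
    """Size of the training set after augmentation (assumed: real + generated)."""
    return subset_size * factor


# Hypothetical stand-in for the COCO training image ids.
image_ids = list(range(10_000))

# The three data-scarce regimes named in the Dataset Splits row.
subsets = {frac: sample_subset(image_ids, frac) for frac in (0.01, 0.05, 0.10)}

for frac, ids in subsets.items():
    print(f"{frac:.0%}: {len(ids)} real images, "
          f"{augmented_size(len(ids))} images after doubling")
```

Fixing the seed keeps the sampled subset identical across runs, which matters when several detectors are compared on the same regime. The summary above does not state which seed, if any, the paper uses.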