reproducibilityindex.ai

Recognize Any Regions

Authors: Haosen Yang, Chuofan Ma, Bin Wen, Yi Jiang, Zehuan Yuan, Xiatian Zhu

NeurIPS 2024

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Extensive experiments in open-world object recognition show that our RegionSpot achieves significant performance gain over prior alternatives
Researcher Affiliation	Collaboration	Haosen Yang¹, Chuofan Ma², Bin Wen³, Yi Jiang³, Zehuan Yuan³, Xiatian Zhu¹; ¹University of Surrey, ²The University of Hong Kong, ³ByteDance
Pseudocode	No	The paper describes the model architecture and process flow but does not include an explicit pseudocode block or algorithm.
Open Source Code	No	The code will be available after being accepted.
Open Datasets	Yes	For training, we utilized publicly available detection datasets, comprising a total of approximately 3 million images. These datasets include Objects365 (O365) [29], Open Images (OI) [15], and V3Det (V3D) [33].
Dataset Splits	Yes	We utilized the extensive LVIS detection dataset [8], which encompasses 1203 categories and 19809 images reserved for validation.
Hardware Specification	Yes	training our model with 3 million data in a single day using 8 V100 GPUs.
Software Dependencies	No	The paper mentions the use of the AdamW optimizer, but does not specify programming language versions or specific library versions such as PyTorch or TensorFlow.
Experiment Setup	Yes	We train RegionSpot using AdamW [13] optimizer with the initial learning rate as 2.5 × 10^-5. All models are trained with a mini-batch size 16 on 8 GPUs. The default training schedule is 450K iterations, with the learning rate divided by 10 at 350K and 420K iterations. The model is trained for 450K iterations at each stage. ... using input image resolution of 336.
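
The step schedule quoted in the Experiment Setup row can be sketched in a few lines of plain Python. This is an illustration only, not the authors' code: the function name lr_at and the constant names are hypothetical, and it assumes that each decay takes effect at the milestone iteration itself (350K and 420K), a boundary convention the quoted text does not state.

```python
from bisect import bisect_right

# Values taken from the quoted setup; the names themselves are hypothetical.
BASE_LR = 2.5e-5                  # initial learning rate
MILESTONES = (350_000, 420_000)   # iterations at which the rate is divided by 10
DECAY_FACTOR = 0.1
TOTAL_ITERS = 450_000             # default schedule length per stage


def lr_at(iteration: int) -> float:
    """Return the learning rate used at a 0-based training iteration.

    Assumption: the decay applies from the milestone iteration onward.
    """
    if not 0 <= iteration < TOTAL_ITERS:
        raise ValueError(f"iteration must be in [0, {TOTAL_ITERS})")
    # Count how many milestones are less than or equal to this iteration.
    passed = bisect_right(MILESTONES, iteration)
    return BASE_LR * DECAY_FACTOR ** passed


if __name__ == "__main__":
    for it in (0, 349_999, 350_000, 420_000, 449_999):
        print(f"iter {it:>7}: lr = {lr_at(it):.2e}")
```

Under these assumptions the rate stays at its initial value for the first 350K iterations, drops by a factor of 10 for the next 70K, and drops by another factor of 10 for the final 30K.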