Object Instance Mining for Weakly Supervised Object Detection

Authors: Chenhao Lin, Siwen Wang, Dongqi Xu, Yu Lu, Wayne Zhang (pp. 11482-11489)

AAAI 2020

Reproducibility Variable Result LLM Response
Research Type Experimental The experimental results on two publicly available databases, VOC 2007 and 2012, demonstrate the efficacy of the proposed approach.
Researcher Affiliation Collaboration Chenhao Lin,1 Siwen Wang,2 Dongqi Xu,1 Yu Lu,1 Wayne Zhang1 1Sense Time Research 2Dalian University of Technology, Dalian, China, 116024
Pseudocode Yes Algorithm 1: Object Instance Mining
Open Source Code Yes https://github.com/bigvideoresearch/OIM
Open Datasets Yes Following the previous state-of-the-art methods on WSOD, we also evaluate our approach on two datasets, PASCAL VOC2007 (Everingham et al. 2010) and VOC2012 (Everingham et al. 2015), which both contain 20 object categories.
Dataset Splits Yes For VOC2007, we train the model on the trainval set (5,011 images) and evaluate the performance on the test set (4,952 images). For VOC2012, the trainval set (11,540 images) and the test set (10,991 images) are used for training and evaluation respectively. Additionally, we train our model on the VOC2012 train set (5,717 images) and perform evaluation on the val set (5,823 images) to further validate the effectiveness of the proposed approach.
Hardware Specification No The paper does not provide specific hardware details (e.g., exact GPU/CPU models, memory amounts) used for running its experiments.
Software Dependencies No The paper mentions a 'VGG16 model pre-trained on the ImageNet dataset' but does not specify version numbers for any software dependencies such as deep learning frameworks (e.g., PyTorch, TensorFlow, Caffe) or programming languages.
Experiment Setup Yes The batch size is set to 2, and the learning rates are set to 0.001 and 0.0001 for the first 40K and the following 50K iterations respectively. During training and test, we take five image scales {480, 576, 688, 864, 1200} along with random horizontal flipping for data augmentation. Following (Tang et al. 2017), the threshold T is set to 0.5. As the number of iterations increases, the network learns more stably, so we dynamically set the hyperparameter α to α1 = 5 for the first 70K iterations and α2 = 2 for the following 20K iterations. β is empirically set to 0.2 in our experiments.
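The schedule quoted above can be sketched in code. This is a minimal illustration of the reported hyperparameters, not the authors' actual configuration file; the function and constant names are hypothetical.

```python
# Hypothetical sketch of the training schedule reported in the paper
# (batch size 2, stepped learning rate, dynamic alpha, fixed beta/threshold).

IMAGE_SCALES = [480, 576, 688, 864, 1200]  # shorter-side scales used for train/test augmentation
BATCH_SIZE = 2
THRESHOLD_T = 0.5  # following Tang et al. 2017
BETA = 0.2         # empirically set weight

def learning_rate(iteration: int) -> float:
    """Step schedule: 0.001 for the first 40K iterations, 0.0001 for the next 50K."""
    return 1e-3 if iteration < 40_000 else 1e-4

def alpha(iteration: int) -> int:
    """Dynamic hyperparameter: alpha1 = 5 for the first 70K iterations, alpha2 = 2 after."""
    return 5 if iteration < 70_000 else 2
```

Total training length implied by the schedule is 90K iterations (40K + 50K), which matches the 70K + 20K split used for α.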