Saliency Guided End-to-End Learning for Weakly Supervised Object Detection

Authors: Baisheng Lai, Xiaojin Gong

IJCAI 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on PASCAL VOC demonstrate that our approach outperforms all state-of-the-arts.
Researcher Affiliation | Academia | College of Information Science & Electronic Engineering, Zhejiang University, China {laibs,gongxj}@zju.edu.cn
Pseudocode | No | The paper describes the network architecture and training procedure in text and with diagrams (Figure 1) but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide any links to open-source code for the described methodology, nor does it state that the code will be made publicly available.
Open Datasets | Yes | The experiments are conducted on the PASCAL VOC 2007 and 2012 datasets [Everingham et al., 2010], which are the benchmark most widely used in WSOD.
Dataset Splits | Yes | The VOC 2007 dataset contains 2501 training, 2510 validation, and 4952 test images. VOC 2012 has 5717 training, 5823 validation, and 10991 test images.
Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used to run the experiments. It only mentions "Our approach is implemented using the MatConvNet toolbox" without further hardware details.
Software Dependencies | No | The paper states "Our approach is implemented using the MatConvNet toolbox [Vedaldi and Lenc, 2015]" but does not provide specific version numbers for this or any other software dependencies.
Experiment Setup | Yes | For training, we run 20 epochs, in which the first 10 epochs take a learning rate of 10^-5 and the second 10 epochs take 10^-6. Each image is randomly flipped and scaled to have maximal width or height of {480, 576, 688, 864, 1200} with respect to the original aspect ratio. The hyper parameters in our network are set empirically as σ = 103, λ1 = 0.1, λ2 = 1 and λ3 = 5 × 10^-4.
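
The training schedule and image augmentation quoted under Experiment Setup can be illustrated with a minimal Python sketch. It is not the authors' code (the paper reports a MatConvNet implementation and releases none). It assumes learning rates of 10^-5 for the first 10 epochs and 10^-6 for the last 10, reads "maximal width or height" as resizing the longer side to the sampled scale, and assumes a flip probability of 0.5, which the paper does not state. The names `learning_rate`, `scaled_size`, and `sample_augmentation` are hypothetical.

```python
import random

# Target sizes for the longer image side, as listed in the paper.
SCALES = (480, 576, 688, 864, 1200)
NUM_EPOCHS = 20


def learning_rate(epoch: int) -> float:
    """Step schedule for a 0-indexed epoch: 1e-5 for epochs 0-9, 1e-6 for 10-19."""
    if not 0 <= epoch < NUM_EPOCHS:
        raise ValueError(f"epoch must be in [0, {NUM_EPOCHS}), got {epoch}")
    return 1e-5 if epoch < 10 else 1e-6


def scaled_size(width: int, height: int, target: int) -> tuple[int, int]:
    """Resize so the longer side equals `target`, keeping the aspect ratio."""
    factor = target / max(width, height)
    return round(width * factor), round(height * factor)


def sample_augmentation(width: int, height: int, rng: random.Random):
    """Pick a random scale and a random horizontal flip for one image.

    The 0.5 flip probability is an assumption, not a value from the paper.
    """
    target = rng.choice(SCALES)
    flip = rng.random() < 0.5
    return scaled_size(width, height, target), flip


if __name__ == "__main__":
    rng = random.Random(0)
    for epoch in (0, 9, 10, 19):
        print(epoch, learning_rate(epoch))
    print(sample_augmentation(500, 375, rng))
```

Under this reading, a 500 x 375 image sampled at scale 480 becomes 480 x 360. If the authors instead constrain the shorter side, `scaled_size` would use `min` rather than `max`.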