Scale Invariant Fully Convolutional Network: Detecting Hands Efficiently
Authors: Dan Liu, Dawei Du, Libo Zhang, Tiejian Luo, Yanjun Wu, Feiyue Huang, Siwei Lyu
AAAI 2019, pp. 4344-4351 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on VIVA and Oxford datasets show that our method achieves competitive performance with the state-of-the-art hand detection methods but with much improved running time efficiency. |
| Researcher Affiliation | Collaboration | (1) University of the Chinese Academy of Sciences, China; (2) University at Albany, SUNY, USA; (3) Institute of Software, Chinese Academy of Sciences, China; (4) Tencent Youtu Lab, China |
| Pseudocode | No | The paper describes its methods and architecture but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | The source code of the proposed method is available at http://39.107.81.62/Diana/sifcn. |
| Open Datasets | Yes | The VIVA Hand Detection Dataset, used in the Vision for Intelligent Vehicles and Applications Challenge (Das, Ohn-Bar, and Trivedi 2015), and the Oxford Hand Detection Dataset (Mittal, Zisserman, and Torr 2011), which links to its official evaluation tool: http://www.robots.ox.ac.uk/~vgg/data/hands/index.html |
| Dataset Splits | Yes | The Oxford Hand Detection Dataset consists of three parts: the training set, the validation set and the testing set, with 1,844, 406 and 436 images, respectively. |
| Hardware Specification | Yes | The experiments are conducted on a single GeForce GTX 1080 GPU and an Intel(R) Core(TM) i7-6700K @ 4.00GHz CPU. |
| Software Dependencies | No | The paper mentions optimizers (ADAM) and backbone networks (VGG16, ResNet50) but does not provide specific version numbers for software dependencies like deep learning frameworks (e.g., PyTorch, TensorFlow) or CUDA. |
| Experiment Setup | Yes | Training is implemented with a stochastic gradient algorithm using the ADAM scheme. We take an exponentially decaying learning rate, whose initial value is 0.0001 and which decays every 10,000 iterations with rate 0.94. The weights w_s, s ∈ {1, 2, 3, 4}, are all set to 1. The hyper-parameters α and β are set to 0.01 and 20, respectively. Besides, the score map threshold is set to 0.8 and NMS is conducted with a threshold of 0.2. For data augmentation, we randomly mirror and crop the images, as well as do color jittering by distorting the hue, saturation and brightness. Due to the limitation of GPU capacity, the batch size is set to 12 and all images are resized to 512 × 512 before being fed into the network. (A hedged configuration sketch is given below the table.) |
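Since the paper does not name a deep learning framework (see the Software Dependencies row), the following is a minimal sketch of how the reported hyper-parameters could be wired up, assuming PyTorch and torchvision. The model, loss, and data below are toy placeholders, not the authors' SIFCN architecture or its w_s-weighted multi-scale loss, and the augmentation strengths are assumptions since the paper does not specify them.

```python
# Hedged sketch of the training configuration reported in the paper.
# PyTorch is assumed purely for illustration; the network and loss are toy
# placeholders rather than the authors' SIFCN implementation.
import torch
from torch import nn
from torch.optim import Adam
from torch.optim.lr_scheduler import StepLR
from torchvision import transforms

# Data augmentation: random mirroring and cropping, plus color jittering of
# hue, saturation and brightness; images end up at 512 x 512.
# The jitter strengths and crop range are assumptions (not given in the paper).
augment = transforms.Compose([
    transforms.RandomHorizontalFlip(),
    transforms.ColorJitter(brightness=0.3, saturation=0.3, hue=0.1),
    transforms.RandomResizedCrop(512, scale=(0.8, 1.0)),
    transforms.ToTensor(),
])

model = nn.Conv2d(3, 1, kernel_size=3, padding=1)  # placeholder for the actual detector
criterion = nn.MSELoss()                           # placeholder for the w_s-weighted loss

optimizer = Adam(model.parameters(), lr=1e-4)      # ADAM, initial learning rate 0.0001
# Staircase exponential decay: multiply the learning rate by 0.94 every
# 10,000 iterations (the scheduler is stepped once per training iteration).
scheduler = StepLR(optimizer, step_size=10_000, gamma=0.94)

BATCH_SIZE = 12        # limited by the memory of a single GTX 1080
SCORE_THRESHOLD = 0.8  # score map threshold at inference time
NMS_THRESHOLD = 0.2    # threshold used for non-maximum suppression

# One illustrative training step on random tensors standing in for a real loader.
images = torch.rand(BATCH_SIZE, 3, 512, 512)
targets = torch.rand(BATCH_SIZE, 1, 512, 512)
optimizer.zero_grad()
loss = criterion(model(images), targets)
loss.backward()
optimizer.step()
scheduler.step()
```

Stepping `StepLR` once per iteration with `step_size=10_000` and `gamma=0.94` reproduces the staircase-style exponential decay described in the setup; the remaining constants are recorded here only to mirror the quoted values.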