Scale Invariant Fully Convolutional Network: Detecting Hands Efficiently
Authors: Dan Liu, Dawei Du, Libo Zhang, Tiejian Luo, Yanjun Wu, Feiyue Huang, Siwei Lyu
AAAI 2019, pp. 4344-4351 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on VIVA and Oxford datasets show that our method achieves competitive performance with the state-of-the-art hand detection methods but with much improved running time efficiency. |
| Researcher Affiliation | Collaboration | (1) University of the Chinese Academy of Sciences, China; (2) University at Albany, SUNY, USA; (3) Institute of Software, Chinese Academy of Sciences, China; (4) Tencent Youtu Lab, China |
| Pseudocode | No | The paper describes its methods and architecture but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | The source code of the proposed method is available at http://39.107.81.62/Diana/sifcn. |
| Open Datasets | Yes | The VIVA Hand Detection Dataset, used in the Vision for Intelligent Vehicles and Applications Challenge (Das, Ohn-Bar, and Trivedi 2015), and the Oxford Hand Detection Dataset (Mittal, Zisserman, and Torr 2011), which links to its official evaluation tool: http://www.robots.ox.ac.uk/~vgg/data/hands/index.html |
| Dataset Splits | Yes | The Oxford Hand Detection Dataset consists of three parts: the training set, the validation set and the testing set, with 1,844, 406 and 436 images, respectively. |
| Hardware Specification | Yes | The experiments are conducted on a single GeForce GTX 1080 GPU and an Intel(R) Core(TM) i7-6700K @ 4.00GHz CPU. |
| Software Dependencies | No | The paper mentions optimizers (ADAM) and backbone networks (VGG16, ResNet50) but does not provide specific version numbers for software dependencies like deep learning frameworks (e.g., PyTorch, TensorFlow) or CUDA. |
| Experiment Setup | Yes | Training is implemented with a stochastic gradient algorithm using the ADAM scheme. We take an exponentially decaying learning rate, whose initial value is 0.0001 and which decays every 10,000 iterations with rate 0.94. The weights w_s, s ∈ {1, 2, 3, 4}, are all set to 1. The hyper-parameters α and β are set to 0.01 and 20, respectively. Besides, the score map threshold is set to 0.8 and NMS is conducted with a threshold of 0.2. For data augmentation, we randomly mirror and crop the images, as well as do color jittering by distorting the hue, saturation and brightness. Due to the limitation of GPU capacity, the batch size is set to 12 and all images are resized to 512 × 512 before being fed into the network. (A hedged configuration sketch is given below the table.) |
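Since the paper does not name a deep learning framework (see the Software Dependencies row), the following is a minimal sketch of how the reported hyper-parameters could be wired up, assuming PyTorch and torchvision. The model, loss, and data below are toy placeholders, not the authors' SIFCN architecture or its w_s-weighted multi-scale loss, and the augmentation strengths are assumptions since the paper does not specify them.

```python
# Hedged sketch of the training configuration reported in the paper.
# PyTorch is assumed purely for illustration; the network and loss are toy
# placeholders rather than the authors' SIFCN implementation.
import torch
from torch import nn
from torch.optim import Adam
from torch.optim.lr_scheduler import StepLR
from torchvision import transforms

# Data augmentation: random mirroring and cropping, plus color jittering of
# hue, saturation and brightness; images end up at 512 x 512.
# The jitter strengths and crop range are assumptions (not given in the paper).
augment = transforms.Compose([
    transforms.RandomHorizontalFlip(),
    transforms.ColorJitter(brightness=0.3, saturation=0.3, hue=0.1),
    transforms.RandomResizedCrop(512, scale=(0.8, 1.0)),
    transforms.ToTensor(),
])

model = nn.Conv2d(3, 1, kernel_size=3, padding=1)  # placeholder for the actual detector
criterion = nn.MSELoss()                           # placeholder for the w_s-weighted loss

optimizer = Adam(model.parameters(), lr=1e-4)      # ADAM, initial learning rate 0.0001
# Staircase exponential decay: multiply the learning rate by 0.94 every
# 10,000 iterations (the scheduler is stepped once per training iteration).
scheduler = StepLR(optimizer, step_size=10_000, gamma=0.94)

BATCH_SIZE = 12        # limited by the memory of a single GTX 1080
SCORE_THRESHOLD = 0.8  # score map threshold at inference time
NMS_THRESHOLD = 0.2    # threshold used for non-maximum suppression

# One illustrative training step on random tensors standing in for a real loader.
images = torch.rand(BATCH_SIZE, 3, 512, 512)
targets = torch.rand(BATCH_SIZE, 1, 512, 512)
optimizer.zero_grad()
loss = criterion(model(images), targets)
loss.backward()
optimizer.step()
scheduler.step()
```

Stepping `StepLR` once per iteration with `step_size=10_000` and `gamma=0.94` reproduces the staircase-style exponential decay described in the setup; the remaining constants are recorded here only to mirror the quoted values.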