Crowd Counting using Deep Recurrent Spatial-Aware Network
Authors: Lingbo Liu, Hongjun Wang, Guanbin Li, Wanli Ouyang, Liang Lin
IJCAI 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on four challenging benchmarks show the effectiveness of our approach. Specifically, comparing with the existing best-performing methods, we achieve an improvement of 12% on the largest dataset WorldExpo'10 and 22.8% on the most challenging dataset UCF_CC_50. |
| Researcher Affiliation | Academia | (1) School of Data and Computer Science, Sun Yat-sen University, Guangzhou, China; (2) School of Electrical and Information Engineering, The University of Sydney, Sydney, Australia |
| Pseudocode | No | The paper provides architectural diagrams for its networks (Figure 2 and Figure 3) but does not include any formal pseudocode blocks or algorithms. |
| Open Source Code | No | The paper does not include any statements or links indicating that the source code for their methodology is open-sourced or publicly available. |
| Open Datasets | Yes | ShanghaiTech [Zhang et al., 2016]. This dataset contains 1,198 images of unconstrained scenes with a total of 330,165 annotated people. UCF_CC_50 [Idrees et al., 2013]. As an extremely challenging benchmark, this dataset contains 50 annotated images of diverse scenes collected from the Internet. MALL [Chen et al., 2012]. This dataset was captured by a publicly accessible surveillance camera in a shopping mall with more challenging lighting conditions and glass surface reflections. WorldExpo'10 [Zhang et al., 2015]. This dataset contains 1,132 video sequences captured by 108 surveillance cameras during the Shanghai World Expo in 2010. |
| Dataset Splits | Yes | Following the standard protocol discussed in [Idrees et al., 2013], we split the dataset into five subsets and perform a five-fold cross-validation. When training, we randomly crop some regions with a range of [0.5, 0.9] from the original images and resize them to 1024 × 768. The testing images are directly resized to the same resolution. ... Following the same setting as [Chen et al., 2012], we use the first 800 frames for training and the remaining 1,200 frames for evaluation. ... The training set consists of 3,380 annotated frames from 103 scenes, while the testing images are extracted from other five different scenes with 120 frames per scene. (See the preprocessing sketch below this table.) |
| Hardware Specification | No | The paper does not explicitly specify the hardware used for running the experiments (e.g., GPU models, CPU types, or memory specifications). It only mentions using 'TensorFlow' but not the underlying hardware. |
| Software Dependencies | Yes | We adopt the TensorFlow [Abadi et al., 2016] toolbox to implement our crowd counting network. ... We optimize our network's parameters with Adam optimization [Kingma and Ba, 2014] by minimizing the loss function Eq. (8). |
| Experiment Setup | Yes | The filter weights of all convolutional layers and fully-connected layers are initialized by a truncated normal distribution with a deviation equal to 0.01. The learning rate is set to 10^-4 initially and multiplied by 0.98 every 1K training iterations. The batch size is set to 1. We optimize our network's parameters with Adam optimization [Kingma and Ba, 2014] by minimizing the loss function Eq. (8). (See the training-configuration sketch below this table.) |
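
The dataset-splits row quotes a random crop covering [0.5, 0.9] of the original image followed by resizing to 1024 × 768, plus a fixed frame split for MALL. Below is a minimal sketch of that preprocessing, assuming current TensorFlow 2 image APIs rather than the original TensorFlow release cited by the authors; the function names `random_scaled_crop` and `split_mall_frames` are illustrative and not taken from the paper's code.

```python
import tensorflow as tf


def random_scaled_crop(image, min_scale=0.5, max_scale=0.9,
                       target_size=(768, 1024)):
    """Crop a random region covering [min_scale, max_scale] of the image,
    then resize it to target_size (height, width), i.e. 1024 x 768 (w x h).
    This is one plausible reading of the quoted protocol, not the authors' code."""
    height = tf.shape(image)[0]
    width = tf.shape(image)[1]
    scale = tf.random.uniform([], min_scale, max_scale)
    crop_h = tf.cast(tf.cast(height, tf.float32) * scale, tf.int32)
    crop_w = tf.cast(tf.cast(width, tf.float32) * scale, tf.int32)
    crop = tf.image.random_crop(image, size=[crop_h, crop_w, 3])
    return tf.image.resize(crop, target_size)


def split_mall_frames(frames):
    """MALL protocol quoted above: first 800 frames for training,
    the remaining frames for evaluation."""
    return frames[:800], frames[800:]
```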
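The experiment-setup row reports a truncated normal initializer with deviation 0.01, an initial learning rate of 10^-4 multiplied by 0.98 every 1K iterations, Adam optimization, and a batch size of 1. The following is a minimal sketch of those settings, assuming TensorFlow 2 Keras APIs (the paper used the original TensorFlow toolbox); the `Conv2D` layer is a placeholder and not part of the paper's architecture.

```python
import tensorflow as tf

# Truncated normal initializer with standard deviation 0.01, as reported.
initializer = tf.keras.initializers.TruncatedNormal(stddev=0.01)

# Learning rate starts at 1e-4 and is multiplied by 0.98 every 1,000 iterations.
lr_schedule = tf.keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate=1e-4,
    decay_steps=1000,
    decay_rate=0.98,
    staircase=True)

# Adam optimizer [Kingma and Ba, 2014] driven by the decayed learning rate.
optimizer = tf.keras.optimizers.Adam(learning_rate=lr_schedule)

# Illustrative placeholder layer showing how the initializer would be applied;
# the actual network layers are described in the paper's Figures 2 and 3.
conv = tf.keras.layers.Conv2D(
    filters=64, kernel_size=3, padding="same",
    kernel_initializer=initializer)

BATCH_SIZE = 1  # batch size reported in the paper
```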