Spatial as Deep: Spatial CNN for Traffic Scene Understanding
Authors: Xingang Pan, Jianping Shi, Ping Luo, Xiaogang Wang, Xiaoou Tang
AAAI 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We apply SCNN on a newly released, very challenging traffic lane detection dataset and the Cityscapes dataset. The results show that SCNN can learn the spatial relationship for structured output and significantly improves performance. We show that SCNN outperforms the recurrent neural network (RNN) based ReNet and MRF+CNN (MRFNet) on the lane detection dataset by 8.7% and 4.6% respectively. (A minimal sketch of SCNN's directional message passing is given after the table.) |
| Researcher Affiliation | Collaboration | ¹The Chinese University of Hong Kong, ²SenseTime Group Limited |
| Pseudocode | No | The paper contains mathematical equations and diagrams, but no structured pseudocode or algorithm blocks are explicitly labeled or formatted as such. |
| Open Source Code | Yes | Code is available at https://github.com/XingangPan/SCNN |
| Open Datasets | Yes | In this paper, we present a large-scale, challenging dataset for traffic lane detection. To collect data, we mounted cameras on six different vehicles driven by different drivers and recorded videos while driving in Beijing on different days. More than 55 hours of videos were collected and 133,235 frames were extracted, which is more than 20 times the size of the TuSimple dataset. We have divided the dataset into 88880 for the training set, 9675 for the validation set, and 34680 for the test set. It also mentions: 'the recently released TuSimple Benchmark Dataset (TuSimple 2017) consists of 1224 and 6408 images with annotated lane markings respectively'. |
| Dataset Splits | Yes | We have divided the dataset into 88880 for the training set, 9675 for the validation set, and 34680 for the test set. (These sizes sum exactly to the 133,235 extracted frames; see the arithmetic check after the table.) |
| Hardware Specification | No | Table 6 lists only generic device classes ('CPU', 'GPU') for its runtime comparison but does not specify exact models (e.g., NVIDIA A100, Intel Xeon E5) or detailed specifications (e.g., amount of RAM, clock speed) for the hardware used in the experiments, which is required for reproducibility. |
| Software Dependencies | Yes | All experiments are implemented on the Torch7 (Collobert, Kavukcuoglu, and Farabet 2011) framework. |
| Experiment Setup | Yes | In both tasks, we train the models using standard SGD with batch size 12, base learning rate 0.01, momentum 0.9, and weight decay 0.0001. The learning rate policy is 'poly', with power and iteration number set to 0.9 and 60K respectively (see the schedule sketch after the table). The initial weights of the first 13 convolution layers are copied from VGG16 (Simonyan and Zisserman 2015) trained on ImageNet (Deng et al. 2009). The output channel number of the fc7 layer is set to 128, the rate for the atrous convolution layer of fc6 is set to 4, and batch normalization (Ioffe and Szegedy 2015) is added before each ReLU layer. During training, the line width of the targets is set to 16 pixels, and the input and target images are rescaled to 800 × 288. The loss of background is multiplied by 0.4. |
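
For context on the method the rows above evaluate: SCNN replaces layer-by-layer convolution with slice-by-slice convolution within a feature map, passing messages downward, upward, rightward, and leftward, with a ReLU applied to each message before it is added to the next slice. The following is a minimal PyTorch sketch of that scheme, not the authors' implementation (which is in Torch7, at the repository linked above); the class name `SpatialCNN`, the default channel count, and the stand-in feature-map size are assumptions, while the kernel width of 9 matches the paper's experiments.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SpatialCNN(nn.Module):
    """Sketch of SCNN's four directional slice-wise message-passing passes
    (down, up, right, left) over a backbone feature map."""

    def __init__(self, channels: int = 128, kernel_width: int = 9):
        super().__init__()
        pad = kernel_width // 2
        # One 1-D convolution per direction, shared across all slices of that pass.
        self.conv_d = nn.Conv2d(channels, channels, (1, kernel_width), padding=(0, pad), bias=False)
        self.conv_u = nn.Conv2d(channels, channels, (1, kernel_width), padding=(0, pad), bias=False)
        self.conv_r = nn.Conv2d(channels, channels, (kernel_width, 1), padding=(pad, 0), bias=False)
        self.conv_l = nn.Conv2d(channels, channels, (kernel_width, 1), padding=(pad, 0), bias=False)

    @staticmethod
    def _propagate(x: torch.Tensor, conv: nn.Conv2d, dim: int, reverse: bool = False) -> torch.Tensor:
        # Cut the map into slices along `dim`; each slice receives the
        # convolved, ReLU-gated message from its already-updated neighbor.
        slices = list(torch.unbind(x, dim=dim))
        if reverse:
            slices.reverse()
        out = [slices[0]]
        for s in slices[1:]:
            out.append(s + F.relu(conv(out[-1].unsqueeze(dim)).squeeze(dim)))
        if reverse:
            out.reverse()
        return torch.stack(out, dim=dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (N, C, H, W) feature map from the backbone (VGG16 in the paper).
        x = self._propagate(x, self.conv_d, dim=2)                # top -> bottom
        x = self._propagate(x, self.conv_u, dim=2, reverse=True)  # bottom -> top
        x = self._propagate(x, self.conv_r, dim=3)                # left -> right
        x = self._propagate(x, self.conv_l, dim=3, reverse=True)  # right -> left
        return x

# Example: a 128-channel feature map of assumed spatial size 36 x 100.
feat = torch.randn(2, 128, 36, 100)
print(SpatialCNN()(feat).shape)  # torch.Size([2, 128, 36, 100])
```

Because each slice depends on the updated previous slice, the passes are sequential along one spatial axis but fully convolutional along the other, which is what lets SCNN propagate information across long, thin structures like lane markings.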
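As a quick consistency check on the quoted dataset splits, the three subsets sum exactly to the 133,235 extracted frames. A few lines of Python:

```python
# Split sizes quoted in the Dataset Splits row above.
splits = {"train": 88880, "val": 9675, "test": 34680}
total = sum(splits.values())
assert total == 133235  # equals the number of extracted frames

for name, n in splits.items():
    print(f"{name}: {n:>6} frames ({n / total:.1%})")
# train:  88880 frames (66.7%)
# val:      9675 frames (7.3%)
# test:    34680 frames (26.0%)
```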
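The 'poly' policy quoted in the Experiment Setup row decays the learning rate as lr = base_lr × (1 − iter/max_iter)^power. Below is a small PyTorch re-expression of that schedule with the quoted hyperparameters; the paper itself used Torch7, and the `Linear` model here is only a stand-in for the actual network.

```python
import torch

# Hyperparameters quoted in the Experiment Setup row above.
BASE_LR, POWER, MAX_ITER = 0.01, 0.9, 60_000

model = torch.nn.Linear(10, 2)  # stand-in for the actual network
optimizer = torch.optim.SGD(model.parameters(), lr=BASE_LR,
                            momentum=0.9, weight_decay=1e-4)
# 'poly' policy: lr = BASE_LR * (1 - iter / MAX_ITER) ** POWER
scheduler = torch.optim.lr_scheduler.LambdaLR(
    optimizer, lambda it: (1.0 - it / MAX_ITER) ** POWER)

# Sample values of the schedule:
for it in (0, 15_000, 30_000, 45_000):
    print(f"iter {it:>6}: lr = {BASE_LR * (1 - it / MAX_ITER) ** POWER:.6f}")
# iter      0: lr = 0.010000
# iter  15000: lr = 0.007719
# iter  30000: lr = 0.005359
# iter  45000: lr = 0.002872
```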