PixelLink: Detecting Scene Text via Instance Segmentation

Authors: Dan Deng, Haifeng Liu, Xuelong Li, Deng Cai

AAAI 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments show that, compared with regression-based methods, PixelLink can achieve better or comparable performance on several benchmarks, while requiring many fewer training iterations and less training data.
Researcher Affiliation | Collaboration | (1) State Key Lab of CAD&CG, College of Computer Science, Zhejiang University; (2) Alibaba-Zhejiang University Joint Institute of Frontier Technologies; (3) CVTE Research; (4) Xi'an Institute of Optics and Precision Mechanics, Chinese Academy of Sciences
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | No | The paper states that the algorithm is implemented in Tensorflow 1.1.0 and pure Python, with the code of the join operation described in Sec. 3.2 compiled with Cython, but it does not provide concrete access to the source code (e.g., a specific repository link or an explicit code release statement).
Open Datasets | Yes | ICDAR2015 (IC15) Challenge 4 of IC15 (Karatzas et al. 2015)... ICDAR2013 (IC13) IC13 (Karatzas et al. 2013)... MSRA-TD500 (TD500) Texts in TD500 (Yao et al. 2012)... fine-tuned on TD500-train + HUST-TR400 (Yao, Bai, and Liu 2014)
Dataset Splits | No | The paper describes training and testing on splits such as IC15-train and IC15-test, but it does not describe a dedicated validation split or its size (e.g., 'X% for validation'). Hyperparameters are tuned, but the data used for tuning is not specified.
Hardware Specification | Yes | When trained with a batch size of 24 on 3 GPUs (GTX Titan X), it takes about 0.65s per iteration, and the whole training processing takes about 7~8 hours. 128G RAM and two Intel Xeon CPUs (2.20GHz) are available on the machine where experiments are conducted.
Software Dependencies | Yes | The whole algorithm is implemented in Tensorflow 1.1.0 and pure Python, with the code of join operation described in Sec. 3.2 compiled with Cython.
Experiment Setup | Yes | PixelLink models are optimized by SGD with momentum = 0.9 and weight decay = 5 × 10^-4. Learning rate is set to 10^-3 for the first 100 iterations, and fixed at 10^-2 for the rest. When trained with a batch size of 24... Input images are resized to 512 × 512. ...minimal shorter side length and area are used for postfiltering and set to 10 and 300 respectively...
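
The experiment setup quoted above can be collected into a small configuration sketch. This is only an illustration of the reported values, not the authors' code: the names `learning_rate`, `keep_box`, and `estimated_iterations` are hypothetical, the 0-based iteration index and the inclusive thresholds in `keep_box` are assumptions, and the iteration estimate assumes the whole reported wall-clock time is spent on training steps.

```python
# Values quoted in the table above.
MOMENTUM = 0.9
WEIGHT_DECAY = 5e-4
BATCH_SIZE = 24
INPUT_SIZE = (512, 512)       # input images are resized to this size
WARMUP_ITERS = 100            # iterations trained at the lower learning rate
SECONDS_PER_ITER = 0.65       # reported on 3 GTX Titan X GPUs
MIN_SHORTER_SIDE = 10         # post-filtering threshold
MIN_AREA = 300                # post-filtering threshold


def learning_rate(iteration):
    """Learning rate for a 0-based iteration index (indexing is an assumption)."""
    return 1e-3 if iteration < WARMUP_ITERS else 1e-2


def keep_box(shorter_side, area):
    """Keep a detected box only if it meets both post-filtering thresholds.

    Whether the comparison is inclusive is an assumption of this sketch.
    """
    return shorter_side >= MIN_SHORTER_SIDE and area >= MIN_AREA


def estimated_iterations(hours):
    """Rough iteration count implied by a wall-clock training time in hours."""
    return hours * 3600 / SECONDS_PER_ITER


if __name__ == "__main__":
    print(learning_rate(0), learning_rate(100))
    print(keep_box(shorter_side=12, area=250))
    print(round(estimated_iterations(7)), round(estimated_iterations(8)))
```

Under these assumptions, the reported 7~8 hours at about 0.65s per iteration corresponds to somewhere around 39,000 to 44,000 iterations. The table does not state the iteration count itself, so this is an estimate only.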