Encoding Spatial Distribution of Convolutional Features for Texture Representation

Authors: Yong Xu, Feng Li, Zhile Chen, Jinxiu Liang, Yuhui Quan

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We applied FE to ResNet-based texture classification and retrieval, and demonstrated its effectiveness on several benchmark datasets.
Researcher Affiliation | Academia | 1) School of Computer Science and Engineering, South China University of Technology, China; 2) Peng Cheng Laboratory, China. yxu@scut.edu.cn, csfengli@mail.scut.edu.cn, cszhilechen@mail.scut.edu.cn, cssherryliang@mail.scut.edu.cn, csyhquan@scut.edu.cn
Pseudocode | No | The paper describes the components and their functions with mathematical equations and block diagrams (Figure 2), but it does not contain structured pseudocode or algorithm blocks.
Open Source Code | No | Our FENet is implemented with PyTorch 1.7, and the code will be released at the website: https://github.com/csfengli/FENet.
Open Datasets | Yes | We apply FENet to texture classification and texture retrieval on six benchmark datasets, including GTOS [2], GTOS-M [9], KTH [37], MINC [38], DTD [39] and FMD [1].
Dataset Splits | Yes | Same as existing work, we use the provided split schemes on GTOS, MINC and DTD, and random 10 splits on KTH-TIPS2b and FMD with recommended split sizes. The mean and standard deviation of classification accuracies over all splits are calculated. The results are reported using two runs on GTOS-M and five-time statistics on other datasets.
Hardware Specification | Yes | All the experiments were run on a single Titan XP GPU.
Software Dependencies | Yes | Our FENet is implemented with PyTorch 1.7.
Experiment Setup | Yes | Based on the empirical parameter setting in fractal analysis in previous studies [21, 35], we set C0 = 3 in GAP, r = 1, 2, ..., 6 in LDEB, K = 16 in PGB, and dt = 2, 3, ..., 6 in GDCB. The output sizes of the two paths in FE are both set to 48. Following [34, 8], we use ResNet-18 and ResNet-50 as the backbone respectively. On all the datasets, FENet is trained with the cross-entropy loss via the momentum SGD optimizer with default settings and 30 epochs. The batch size is set to 16 on FMD, 32 on KTH, and 64 on the other four datasets. The learning rate is initialized to 1e-3 on FMD, 5e-3 on MINC, GTOS and GTOS-M, and 1e-2 on KTH and DTD datasets, with cosine decay every 10 epochs. The ResNet backbones are initialized with the pre-trained models on ImageNet. The {Gr}r in (5) are initialized as Gaussian kernels with bandwidth 1. Other parameters of FENet are initialized by Xavier [40]. Data augmentation via horizontal flipping and random cropping to 224×224 is applied.
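The Dataset Splits row reports the mean and standard deviation of classification accuracies over random splits. A minimal plain-Python sketch of that aggregation is below; the accuracy values are made-up placeholders, not results from the paper.

```python
import statistics

def summarize_splits(accuracies):
    """Aggregate per-split classification accuracies (in %)
    into the mean and sample standard deviation reported
    as "mean +/- std" over the random splits."""
    mean = statistics.mean(accuracies)
    std = statistics.stdev(accuracies)  # sample std over splits
    return mean, std

# Hypothetical accuracies from 10 random splits (placeholders).
accs = [78.2, 79.1, 77.5, 78.8, 79.4, 78.0, 77.9, 78.6, 79.0, 78.3]
mean, std = summarize_splits(accs)
print(f"{mean:.1f} +/- {std:.1f}")
```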
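The Experiment Setup row specifies per-dataset initial learning rates "with cosine decay every 10 epochs". One plausible reading is cosine annealing that restarts every 10 epochs over the 30-epoch run; the sketch below implements that interpretation in plain Python (the `cosine_lr` helper and the restart behavior are assumptions, not confirmed by the paper).

```python
import math

def cosine_lr(base_lr, epoch, period=10):
    """Cosine-annealed learning rate, restarting every `period`
    epochs: lr(t) = 0.5 * base_lr * (1 + cos(pi * t / period)),
    where t is the epoch index within the current period."""
    t = epoch % period
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * t / period))

# Per-dataset initial learning rates quoted in the setup.
base_lrs = {"FMD": 1e-3, "MINC": 5e-3, "GTOS": 5e-3,
            "GTOS-M": 5e-3, "KTH": 1e-2, "DTD": 1e-2}

for epoch in range(0, 30, 5):
    print(epoch, cosine_lr(base_lrs["DTD"], epoch))
```

At the start of each 10-epoch period the rate returns to its initial value; halfway through a period it has decayed to half of it.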