A Sketch-Transformer Network for Face Photo-Sketch Synthesis

Authors: Mingrui Zhu, Changcheng Liang, Nannan Wang, Xiaoyu Wang, Zhifeng Li, Xinbo Gao

IJCAI 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments show that the proposed method achieves significant improvements over state-of-the-art approaches in both quantitative and qualitative evaluations. In this section, we first describe the experimental settings. We then conduct an ablation study to quantify the contribution of different configurations to overall effectiveness. Finally, we compare our results with state-of-the-art methods both qualitatively and quantitatively.
Researcher Affiliation | Collaboration | Mingrui Zhu1, Changcheng Liang1, Nannan Wang1, Xiaoyu Wang2, Zhifeng Li3 and Xinbo Gao4; 1State Key Laboratory of Integrated Services Networks, Xidian University, Xi'an, China; 2The Chinese University of Hong Kong (Shenzhen), Shenzhen, China; 3Tencent, Shenzhen, China; 4Chongqing Key Laboratory of Image Cognition, Chongqing University of Posts and Telecommunications, Chongqing, China
Pseudocode | No | The paper describes the model architecture and components through text and diagrams, but does not provide structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not explicitly state that the source code for the proposed method is publicly available, nor does it provide a link to it.
Open Datasets | Yes | The experiments are conducted on two public databases: the CUFS database [Tang and Wang, 2009] and the CUFSF database [Zhang et al., 2011b].
Dataset Splits | Yes | The training/test split follows [Zhu et al., 2017b]. For the CUFS database, 88 face photo-sketch pairs from the CUHK database, 80 from the AR database, and 100 from the XM2VTS database are selected for training; the rest are used for testing. For the CUFSF database, 250 face photo-sketch pairs are selected for training and the rest are used for testing. Table 1 shows the partition settings of the databases.
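The partition counts quoted above can be expressed as a small lookup table. This is an illustrative sketch only; the names (`CUFS_TRAIN`, `CUFSF_TRAIN`, `cufs_train_total`) are not from the paper.

```python
# Training-set sizes per sub-database, as reported for the CUFS split;
# the remaining pairs in each sub-database form the test set.
CUFS_TRAIN = {"CUHK": 88, "AR": 80, "XM2VTS": 100}
CUFSF_TRAIN = 250  # CUFSF: 250 pairs for training, rest for testing

def cufs_train_total():
    """Total number of CUFS training pairs (88 + 80 + 100 = 268)."""
    return sum(CUFS_TRAIN.values())
```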
Hardware Specification | Yes | All models are trained on an NVIDIA Tesla V100 GPU using the Adam optimizer with β1 = 0.5 and β2 = 0.99.
Software Dependencies | No | The paper mentions using the Adam optimizer and a pre-trained VGG-19 model, but does not specify version numbers for any software, libraries, or frameworks used in the implementation.
Experiment Setup | Yes | All models are trained on an NVIDIA Tesla V100 GPU using the Adam optimizer with β1 = 0.5 and β2 = 0.99. We train all models with a fixed learning rate of 0.0002 for 300,000 iterations. The batch size is set to 1 for all experiments. Weights are initialized from a Gaussian distribution with mean 0 and standard deviation 0.02. We scale the input images to 256 × 256 and normalize pixel values to the interval [-1, 1] before feeding them into the model. During training, we update G and D alternately at every iteration.
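The reported setup can be sketched as a small configuration, a minimal NumPy-based illustration assuming standard image preprocessing conventions; the names (`HYPERPARAMS`, `init_weights`, `normalize`) are illustrative, not from the paper.

```python
import numpy as np

# Hyperparameters as reported in the experiment setup.
HYPERPARAMS = {
    "lr": 2e-4,             # fixed learning rate
    "betas": (0.5, 0.99),   # Adam beta1, beta2
    "batch_size": 1,
    "iterations": 300_000,
    "image_size": (256, 256),
}

def init_weights(shape, seed=0):
    """Draw weights from a Gaussian with mean 0 and std 0.02."""
    rng = np.random.default_rng(seed)
    return rng.normal(loc=0.0, scale=0.02, size=shape)

def normalize(img):
    """Map uint8 pixel values from [0, 255] into [-1, 1]."""
    return img.astype(np.float32) / 127.5 - 1.0
```

In an actual alternating-GAN training loop, the generator G and discriminator D would each get their own Adam optimizer with these betas, and their updates would alternate every iteration as described.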