Towards Ultra-Resolution Neural Style Transfer via Thumbnail Instance Normalization
Authors: Zhe Chen, Wenhai Wang, Enze Xie, Tong Lu, Ping Luo
AAAI 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments show that our URST surpasses existing SOTA methods on ultra-high resolution images, benefiting from the effectiveness of the proposed stroke perceptual loss in enlarging the stroke size. To verify the versatility of our URST, we apply it to 6 representative style transfer methods, including Johnson et al. (Johnson, Alahi, and Fei-Fei 2016), MSG-Net (Zhang and Dana 2018), AdaIN (Huang and Belongie 2017), WCT (Li et al. 2017b), Linear WCT (Li et al. 2019), and Wang et al. (Wang et al. 2020). Ablation Study: Thumbnail Instance Normalization. As discussed, consistent normalization statistics are important for patch-wise style transfer. To verify this, we conduct experiments of patch-wise style transfer with IN and the proposed TIN, respectively. (A code sketch of this thumbnail-statistics idea appears after the table.) |
| Researcher Affiliation | Academia | Zhe Chen (1), Wenhai Wang (2)*, Enze Xie (3), Tong Lu (1)*, Ping Luo (3); (1) State Key Lab for Novel Software Technology, Nanjing University; (2) Shanghai Artificial Intelligence Laboratory; (3) The University of Hong Kong |
| Pseudocode | No | The paper describes the proposed method using textual descriptions and diagrams, but it does not include any formal pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code is available at https://git.io/URST. |
| Open Datasets | Yes | Following common practices (Chen and Schmidt 2016; Deng et al. 2020; Li et al. 2019), we use MS-COCO dataset (Lin et al. 2014) as content images and WikiArt dataset (Nichol 2016) as style images, both of which contain roughly 80,000 training samples. |
| Dataset Splits | No | The paper states that the MS-COCO and WikiArt datasets each contain "roughly 80,000 training samples" but does not specify how the data was split into training, validation, and test sets, or provide percentages/counts for these splits. |
| Hardware Specification | Yes | All models are trained with a batch size of 8 on a Titan XP GPU... |
| Software Dependencies | No | The paper mentions using a VGG19 network and applying URST to various style transfer methods (e.g., AdaIN, WCT), but it does not specify software versions for libraries, frameworks (like PyTorch or TensorFlow), or programming languages. |
| Experiment Setup | Yes | All models are trained with a batch size of 8 on a Titan XP GPU, and other training settings are the same as the original settings in the selected style transfer methods (Huang and Belongie 2017; Li et al. 2019). In our experiments, λ is set to 1.0 by default. During training, our stroke perceptual loss is computed at the relu4_1 layer of the VGG network. We use a 1064 × 1064 pixels sliding window with a stride of 1000 to divide the input image, and the style image used in our framework is 1024 × 1024 pixels. |
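
To make the quoted setup concrete, the sketch below is a minimal, hypothetical rendering (not the authors' released code at https://git.io/URST) of two mechanics described above: dividing an ultra-high-resolution image with a 1064 × 1064 sliding window at stride 1000, and the idea behind Thumbnail Instance Normalization, where every patch is normalized with statistics frozen from a shared thumbnail rather than patch-local ones. The names `ThumbnailInstanceNorm`, `window_starts`, and `divide_patches` are illustrative, and for brevity the normalization is applied to raw pixels rather than inside a style network's encoder, where it would actually sit.

```python
# Hypothetical sketch of thumbnail-statistics normalization and sliding-window
# patching, following the parameters quoted in the Experiment Setup row.
import torch
import torch.nn.functional as F


class ThumbnailInstanceNorm(torch.nn.Module):
    """Instance norm whose per-channel statistics come from a thumbnail pass.

    With collect=True, a forward pass caches the thumbnail's mean/std; later
    patch passes reuse the cache, so all patches share one consistent set of
    statistics instead of computing patch-local ones.
    """

    def __init__(self, eps: float = 1e-5):
        super().__init__()
        self.eps = eps
        self.collect = True
        self.mean = None
        self.std = None

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        if self.collect:  # thumbnail pass: cache global statistics
            self.mean = x.mean(dim=(2, 3), keepdim=True)
            self.std = x.std(dim=(2, 3), keepdim=True) + self.eps
        return (x - self.mean) / self.std


def window_starts(length: int, patch: int, stride: int) -> list[int]:
    """Start offsets of a 1-D sliding window, clamped to cover the remainder."""
    starts = list(range(0, max(length - patch, 0) + 1, stride))
    if starts[-1] + patch < length:  # last window covers the right/bottom edge
        starts.append(length - patch)
    return starts


def divide_patches(img: torch.Tensor, patch: int = 1064, stride: int = 1000):
    """Split a (1, C, H, W) image into overlapping patches (paper's defaults)."""
    _, _, h, w = img.shape
    return [
        img[:, :, t:t + patch, l:l + patch]
        for t in window_starts(h, patch, stride)
        for l in window_starts(w, patch, stride)
    ]


if __name__ == "__main__":
    image = torch.rand(1, 3, 2048, 3000)  # stand-in for a UHR content image
    tin = ThumbnailInstanceNorm()

    # Thumbnail pass: cache global statistics from a downsampled view.
    thumb = F.interpolate(image, size=(1024, 1024), mode="bilinear",
                          align_corners=False)
    tin(thumb)
    tin.collect = False

    # Patch passes: every patch is normalized with the thumbnail's statistics.
    patches = divide_patches(image)
    normalized = [tin(p) for p in patches]
    print(len(patches), normalized[0].shape)  # 6 torch.Size([1, 3, 1064, 1064])
```

Freezing the thumbnail statistics is what keeps patch outputs mutually consistent: ordinary IN would renormalize each patch with its own mean and variance, producing visible seams when the stylized patches are reassembled, which is precisely the IN-versus-TIN comparison the ablation study quoted above examines.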