Dynamic Instance Normalization for Arbitrary Style Transfer
Authors: Yongcheng Jing, Xiao Liu, Yukang Ding, Xinchao Wang, Errui Ding, Mingli Song, Shilei Wen
AAAI 2020, pp. 4369-4376 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results demonstrate that the proposed approach yields very encouraging results on challenging style patterns and, to our best knowledge, for the first time enables an arbitrary style transfer using MobileNet-based lightweight architecture, leading to a reduction factor of more than twenty in computational cost as compared to existing approaches. |
| Researcher Affiliation | Collaboration | Yongcheng Jing (1), Xiao Liu (2), Yukang Ding (2), Xinchao Wang (3), Errui Ding (2), Mingli Song (1), Shilei Wen (2); (1) Zhejiang University, (2) Department of Computer Vision Technology (VIS), Baidu Inc., (3) Stevens Institute of Technology |
| Pseudocode | No | The paper includes network diagrams and mathematical formulas but no explicit pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any explicit statements or links indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | Our network is trained on 82,783 content images from Microsoft COCO dataset (Lin et al. 2014), and 79,433 style images from WikiArt (Nichol 2016). |
| Dataset Splits | No | The paper mentions training and testing datasets but does not specify details for a validation split (percentages, sample counts, or explicit validation set name). |
| Hardware Specification | Yes | The training takes roughly one day on an NVIDIA Tesla V100 GPU. |
| Software Dependencies | No | The paper mentions using specific models and optimizers like Adam and VGG-19, but does not provide version numbers for general software dependencies such as programming languages, deep learning frameworks (e.g., PyTorch, TensorFlow), or other libraries. |
| Experiment Setup | Yes | The content loss is computed at layer {relu4_1}, while the style loss is computed at layers {relu1_1, relu2_1, relu3_1, relu4_1} of the VGG network. During training, we adopt the Adam optimizer (Kingma and Ba 2015). The learning rates for both the image encoder and decoder are set to 0.0001. The weight and bias networks in DIN layers are set to have a 10× learning rate for faster convergence (see the training-setup sketch following the table). |
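
For orientation, the following is a minimal sketch of what a Dynamic Instance Normalization (DIN) layer could look like in PyTorch: the content feature is instance-normalized and then scaled and shifted by parameters predicted from the style feature by small weight and bias networks. The sub-network sizes and the per-channel pooling choice are illustrative assumptions, not taken from the paper.

```python
import torch
import torch.nn as nn


class DynamicInstanceNorm(nn.Module):
    """Instance-normalizes the content feature, then applies a scale and
    shift predicted from the style feature by small weight/bias networks."""

    def __init__(self, channels):
        super().__init__()
        self.norm = nn.InstanceNorm2d(channels, affine=False)
        # Hypothetical weight/bias networks: a global pool followed by a
        # 1x1 convolution, producing per-channel affine parameters.
        self.weight_net = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Conv2d(channels, channels, 1))
        self.bias_net = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Conv2d(channels, channels, 1))

    def forward(self, content_feat, style_feat):
        gamma = self.weight_net(style_feat)  # (N, C, 1, 1) dynamic scale
        beta = self.bias_net(style_feat)     # (N, C, 1, 1) dynamic shift
        return gamma * self.norm(content_feat) + beta


# Usage with random features of matching channel count
content = torch.randn(1, 64, 32, 32)
style = torch.randn(1, 64, 32, 32)
din = DynamicInstanceNorm(64)
out = din(content, style)  # same spatial shape as the content feature
```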
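
The optimizer configuration reported under Experiment Setup could be wired up roughly as follows, assuming a PyTorch implementation. The module definitions are placeholders; only the Adam choice, the 0.0001 base learning rate, the 10× rate for the DIN weight/bias networks, and the VGG loss layers come from the paper.

```python
import torch.nn as nn
from torch.optim import Adam

# Placeholder modules standing in for the actual networks.
encoder = nn.Conv2d(3, 64, 3, padding=1)   # stand-in for the image encoder
decoder = nn.Conv2d(64, 3, 3, padding=1)   # stand-in for the image decoder
din_params = nn.Conv2d(64, 64, 1)          # stand-in for DIN weight/bias nets

base_lr = 1e-4  # learning rate for both the image encoder and decoder
optimizer = Adam([
    {"params": encoder.parameters(), "lr": base_lr},
    {"params": decoder.parameters(), "lr": base_lr},
    # DIN weight and bias networks use a 10x learning rate for faster convergence.
    {"params": din_params.parameters(), "lr": 10 * base_lr},
])

# Perceptual losses are computed on a fixed VGG-19:
# content loss at relu4_1, style loss at relu1_1 through relu4_1.
content_layers = ["relu4_1"]
style_layers = ["relu1_1", "relu2_1", "relu3_1", "relu4_1"]
```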