MicroAST: Towards Super-fast Ultra-Resolution Arbitrary Style Transfer

Authors: Zhizhong Wang, Lei Zhao, Zhiwen Zuo, Ailin Li, Haibo Chen, Wei Xing, Dongming Lu

AAAI 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Comprehensive experiments have been conducted to demonstrate the effectiveness of our method."
Researcher Affiliation | Academia | College of Computer Science and Technology, Zhejiang University. {endywon, cszhl, zzwcs, liailin, cshbchen, wxing, ldm}@zju.edu.cn
Pseudocode | No | The paper describes the proposed method in text and with diagrams, but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | No | The paper links to 'https://github.com/EndyWon/MicroAST/releases/download/v1.0.0/MicroAST_SM.pdf', which is the supplementary material PDF rather than a source-code repository for the method. There is no explicit statement about code release.
Open Datasets | Yes | "We train our MicroAST using MS-COCO (Lin et al. 2014) as content images and WikiArt (Phillips and Mackintosh 2011) as style images."
Dataset Splits | No | The paper describes training-time augmentation: "During training, all images are loaded with the smaller dimension rescaled to 512 pixels while preserving the aspect ratio, and then randomly cropped to 256×256 pixels for augmentation." However, it does not explicitly describe a validation split or how validation was performed.
Hardware Specification | Yes | "GFLOPs and Time are measured when the content and style are both 4K images and tested on an NVIDIA RTX 2080 (8GB) GPU."
Software Dependencies | No | The paper mentions PyTorch and other components such as VGG-19, MobileNet, and the Adam optimizer, but does not specify version numbers, which are required for reproducible software dependencies.
Experiment Setup | Yes | "The loss weights in Eq. (1) are set to λc = 1, λs = 3, and λssc = 3. We use the Adam optimizer (Kingma and Ba 2015) with a learning rate of 0.0001 and a mini-batch size of 8 content-style image pairs. During training, all images are loaded with the smaller dimension rescaled to 512 pixels while preserving the aspect ratio, and then randomly cropped to 256×256 pixels for augmentation."
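The experiment-setup row above reports concrete hyperparameters (loss weights, Adam learning rate, batch size) and a resize-then-crop augmentation rule. As a minimal plain-Python sketch of that rule, with hypothetical helper names (the authors' actual pipeline is not released, and would presumably use torchvision-style transforms), the reported values could be captured like this:

```python
import random

def rescale_dims(width, height, short_side=512):
    """Rescale so the smaller dimension equals short_side,
    preserving the aspect ratio (as described in the paper)."""
    scale = short_side / min(width, height)
    return round(width * scale), round(height * scale)

def random_crop_box(width, height, crop=256):
    """Top-left corner and size of a random 256x256 training crop."""
    left = random.randint(0, width - crop)
    top = random.randint(0, height - crop)
    return left, top, crop, crop

# Hyperparameters reported in the paper's experiment setup
LOSS_WEIGHTS = {"lambda_c": 1, "lambda_s": 3, "lambda_ssc": 3}
LEARNING_RATE = 1e-4  # Adam optimizer (Kingma and Ba 2015)
BATCH_SIZE = 8        # content-style image pairs per mini-batch
```

For example, `rescale_dims(1024, 768)` maps the smaller side (768) to 512, giving a 683×512 image, from which a random 256×256 patch is then cropped for augmentation.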