MicroAST: Towards Super-fast Ultra-Resolution Arbitrary Style Transfer
Authors: Zhizhong Wang, Lei Zhao, Zhiwen Zuo, Ailin Li, Haibo Chen, Wei Xing, Dongming Lu
AAAI 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Comprehensive experiments have been conducted to demonstrate the effectiveness of our method. |
| Researcher Affiliation | Academia | College of Computer Science and Technology, Zhejiang University {endywon, cszhl, zzwcs, liailin, cshbchen, wxing, ldm}@zju.edu.cn |
| Pseudocode | No | The paper describes the proposed method in text and with diagrams, but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper links to 'https://github.com/EndyWon/MicroAST/releases/download/v1.0.0/MicroAST_SM.pdf', which is the supplementary material PDF rather than a source code repository for the methodology. There is no explicit statement about code release. |
| Open Datasets | Yes | We train our MicroAST using MS-COCO (Lin et al. 2014) as content images and WikiArt (Phillips and Mackintosh 2011) as style images. |
| Dataset Splits | No | The paper mentions data augmentation for training: 'During training, all images are loaded with the smaller dimension rescaled to 512 pixels while preserving the aspect ratio, and then randomly cropped to 256×256 pixels for augmentation.' However, it does not explicitly provide details about a validation dataset split or how validation was performed. |
| Hardware Specification | Yes | GFLOPs and Time are measured when the content and style are both 4K images and tested on an NVIDIA RTX 2080 (8GB) GPU. |
| Software Dependencies | No | The paper mentions PyTorch and other components like VGG-19, MobileNet, and the Adam optimizer, but does not specify their version numbers, which are required for reproducible software dependencies. |
| Experiment Setup | Yes | The loss weights in Eq. (1) are set to λc = 1, λs = 3, and λssc = 3. We use the Adam optimizer (Kingma and Ba 2015) with a learning rate of 0.0001 and a mini-batch size of 8 content-style image pairs. During training, all images are loaded with the smaller dimension rescaled to 512 pixels while preserving the aspect ratio, and then randomly cropped to 256×256 pixels for augmentation. |
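The augmentation quoted in the Experiment Setup row (rescale the smaller image side to 512 pixels, then take a random 256×256 crop) can be sketched as below. This is a minimal illustration of the described geometry only; the function names are hypothetical and not taken from the authors' code, which the table above notes is not released.

```python
import random

def rescale_smaller_side(width, height, target=512):
    """Return (new_width, new_height) with the smaller side scaled to
    `target` pixels, preserving the aspect ratio (as described in the paper)."""
    scale = target / min(width, height)
    return round(width * scale), round(height * scale)

def random_crop_box(width, height, crop=256):
    """Return (left, top, right, bottom) for a random crop x crop window
    fully inside a width x height image."""
    left = random.randint(0, width - crop)
    top = random.randint(0, height - crop)
    return left, top, left + crop, top + crop

# Example: a 1024x768 content image is rescaled so its smaller side is 512,
# then a random 256x256 patch is selected for training.
w, h = rescale_smaller_side(1024, 768)   # -> (683, 512)
box = random_crop_box(w, h)              # random 256x256 window
```

In a PyTorch pipeline this would typically be expressed with `torchvision.transforms.Resize(512)` followed by `RandomCrop(256)`, feeding mini-batches of 8 content-style pairs to Adam with a learning rate of 0.0001.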