User-Controllable Arbitrary Style Transfer via Entropy Regularization
Authors: Jiaxin Cheng, Yue Wu, Ayush Jaiswal, Xu Zhang, Pradeep Natarajan, Prem Natarajan
AAAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical results demonstrate the superiority of the proposed solution, with speed and stylization quality comparable to or better than existing AST and significantly more diverse than previous DAST works. Code is available at https://github.com/cplusx/eps-Assign-and-Mix. |
| Researcher Affiliation | Collaboration | Jiaxin Cheng1,*, Yue Wu2, Ayush Jaiswal2, Xu Zhang2, Pradeep Natarajan2, Prem Natarajan2. 1 USC Information Sciences Institute; 2 Amazon Alexa Natural Understanding. chengjia@isi.edu; {wuayue,ayujaisw,xzhnamz,natarap,premknat}@amazon.com |
| Pseudocode | No | The paper describes algorithms (e.g., Sinkhorn-Knopp algorithm) and refers to supplemental material for details, but it does not provide any pseudocode or algorithm blocks within the main text. |
| Open Source Code | Yes | Code is available at https://github.com/cplusx/eps-Assign-and-Mix. |
| Open Datasets | Yes | The MS-COCO (Lin et al. 2014) and the Painter-By-Numbers (Nichol 2016) (PBN) datasets are used for content and style images, respectively. |
| Dataset Splits | No | The paper states that it trains for '160,000 iterations' and uses 'content and style losses' but does not explicitly provide specific train/validation/test dataset splits, percentages, or sample counts needed to reproduce the data partitioning. |
| Hardware Specification | Yes | We report inference time based on the average of 100 inference runs of 256×256 images on an NVIDIA Titan X GPU. |
| Software Dependencies | No | The paper mentions using an 'Adam optimizer' and 'VGG-19' for the encoder, but it does not specify version numbers for any programming languages, libraries, or frameworks (e.g., Python, PyTorch, TensorFlow, CUDA) that would be needed for software reproducibility. |
| Experiment Setup | Yes | We randomly sample ε from [1e-4, 1] and sample (content, style) pairs. We resize images to 256×256 and train models for 160,000 iterations. We tune the entire network end-to-end with the Adam optimizer, using 1e-4 as the learning rate and 1e-5 as the weight decay. |
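The Pseudocode row notes that the paper relies on the Sinkhorn-Knopp algorithm (for entropy-regularized style assignment) without printing it. A minimal NumPy sketch of that classic algorithm is below; the function name, uniform marginals, and iteration count are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def sinkhorn_knopp(cost, eps=0.1, n_iters=100):
    """Entropy-regularized optimal transport via Sinkhorn-Knopp (sketch).

    cost:    (n, m) cost matrix, e.g. between content and style features.
    eps:     entropy-regularization strength (the paper samples it
             from [1e-4, 1]).
    Returns an (n, m) transport plan whose rows sum to 1/n and whose
    columns sum to 1/m (uniform marginals assumed here).
    """
    n, m = cost.shape
    a = np.full(n, 1.0 / n)          # uniform row marginal
    b = np.full(m, 1.0 / m)          # uniform column marginal
    K = np.exp(-cost / eps)          # Gibbs kernel
    v = np.full(m, 1.0 / m)          # initial column scaling
    for _ in range(n_iters):
        u = a / (K @ v)              # rescale rows toward marginal a
        v = b / (K.T @ u)            # rescale columns toward marginal b
    return u[:, None] * K * v[None, :]
```

For example, `sinkhorn_knopp(np.random.rand(4, 5), eps=0.5)` returns a 4×5 plan whose entries sum to 1; smaller eps pushes the plan toward a harder (more peaked) assignment, which is the knob the paper exposes to users.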
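The Experiment Setup row can be collected into a small configuration sketch. The constants below are quoted from the paper; the helper name is hypothetical, and since the paper does not say whether ε is drawn uniformly or log-uniformly from [1e-4, 1], a uniform draw is assumed.

```python
import random

# Hyperparameters quoted from the paper's experiment setup.
LEARNING_RATE = 1e-4      # Adam learning rate
WEIGHT_DECAY = 1e-5       # Adam weight decay
NUM_ITERATIONS = 160_000  # training iterations
IMAGE_SIZE = 256          # images resized to 256x256

def sample_epsilon(low=1e-4, high=1.0):
    """Draw the entropy-regularization strength ε for one training step.

    The paper samples ε from [1e-4, 1]; the distribution is not
    specified, so a uniform draw is assumed here.
    """
    return random.uniform(low, high)
```

At each of the 160,000 iterations one would draw ε with `sample_epsilon()` alongside a random (content, style) pair, then take an Adam step with the learning rate and weight decay above.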