Stroke-Based Stylization Learning and Rendering with Inverse Reinforcement Learning
Authors: Ning Xie, Tingting Zhao, Feng Tian, Xiao Hua Zhang, Masashi Sugiyama
IJCAI 2015
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through experiments, we demonstrate that our system can successfully learn artists' styles and render pictures with consistent and smooth brush strokes. Figure 6 plots the average return over 16 trials as a function of policy update iterations, obtained by the policies learned by our approach. Returns at each trial are computed over 300 training episode samples. Stroke drawing results by an artist, the agent trained with the learned reward function, and the agent trained with the manually designed reward function [Xie et al., 2012] are compared in Figure 7. Finally, we applied the policy obtained by our method to the photo artistic conversion system [Xie et al., 2011] (Figure 8). To further investigate our IRL-based method, we performed a user study on the aesthetic assessment of traditional oriental ink painting simulation, comparing the proposed system with the brush stroke (Sumi-e) filter of state-of-the-art commercial software (Adobe Photoshop CC 2014). We invited 318 individuals to take the online questionnaire survey. |
| Researcher Affiliation | Academia | Ning Xie, Tingting Zhao, Feng Tian, Xiaohua Zhang, and Masashi Sugiyama. Tongji University, China; Tianjin University of Science and Technology, China; Bournemouth University, UK; Hiroshima Institute of Technology, Japan; The University of Tokyo, Japan. |
| Pseudocode | No | The paper describes algorithms like inverse RL and policy learning methods but does not provide them in pseudocode or a clearly labeled algorithm block. |
| Open Source Code | No | The paper does not provide any explicit statements about releasing source code for the methodology described, nor does it include a link to a code repository. |
| Open Datasets | No | The paper describes the creation of a custom dataset by video-recording brush motions using a specially designed device: 'To learn a particular artist's stroke drawing style, we collect stroke data from brush motion and drawings on the canvas and then learn the reward function from the collected data. We designed a device shown in Figure 3 to video-record brush motion.' However, it does not provide concrete access information (link, DOI, repository, or citation) indicating that this dataset is publicly available. |
| Dataset Splits | No | The paper mentions 'Returns at each trial are computed over 300 training episode samples' and a 'user study over 318 candidates', but it does not specify explicit training/validation/test dataset splits (e.g., percentages, sample counts, or references to predefined splits) for model training or evaluation. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments, such as GPU/CPU models, memory, or cloud instance types. |
| Software Dependencies | No | The paper does not provide specific ancillary software details, such as library names with version numbers, required to replicate the experiments. |
| Experiment Setup | Yes | The discounted cumulative reward along h, called the return, is given by $\sum_{t=1}^{T} \gamma^{t-1} R(s_t, a_t, s_{t+1})$, where $\gamma \in [0, 1)$ is the discount factor for future rewards. To set the values of the five parameters $\alpha_1, \alpha_2, \ldots, \alpha_5$, we use the maximum-margin inverse reinforcement learning method [Abbeel and Ng, 2004]. Figure 6 plots the average return over 16 trials as a function of policy update iterations, obtained by the policies learned by our approach. Returns at each trial are computed over 300 training episode samples. This graph shows that the average return sharply increases in an early stage and then converges at about the 20th iteration. (A minimal sketch of the return and feature-expectation computations is given below the table.) |
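
As a rough illustration of the quantities quoted in the Experiment Setup row, the sketch below computes the discounted return and the empirical discounted feature expectations that maximum-margin IRL [Abbeel and Ng, 2004] matches between expert demonstrations and the learned policy. The five-weight reward parameterization ($\alpha_1, \ldots, \alpha_5$) follows the paper's description, but the function names, feature representation, episode format, and discount value are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def discounted_return(rewards, gamma=0.99):
    """Return along an episode h: sum_{t=1}^{T} gamma^(t-1) * R(s_t, a_t, s_{t+1}).

    `rewards` is the list [R_1, ..., R_T]. gamma=0.99 is an assumed value;
    the quoted text only states gamma is in [0, 1).
    """
    return sum((gamma ** t) * r for t, r in enumerate(rewards))

def linear_reward(phi, alphas):
    """Reward as a weighted sum of five stroke features,
    R(s, a, s') = sum_i alpha_i * phi_i(s, a, s'),
    with weights alpha_1..alpha_5 fit by maximum-margin IRL."""
    return float(np.dot(alphas, phi))

def feature_expectations(episodes, gamma=0.99, n_features=5):
    """Empirical discounted feature expectations mu = E[ sum_t gamma^(t-1) * phi_t ],
    the quantity matched between expert and learned policies in max-margin IRL.
    Each episode is a list of 5-dimensional feature vectors (one per time step)."""
    mu = np.zeros(n_features)
    for episode in episodes:
        for t, phi in enumerate(episode):
            mu += (gamma ** t) * np.asarray(phi, dtype=float)
    return mu / len(episodes)
```

Under a linear reward of this form, the expected return of a policy equals the dot product of the weight vector with its discounted feature expectations, which is why matching feature expectations between the learned policy and the expert suffices to match expected returns in maximum-margin IRL.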