Stroke-Based Stylization Learning and Rendering with Inverse Reinforcement Learning
Authors: Ning Xie, Tingting Zhao, Feng Tian, Xiao Hua Zhang, Masashi Sugiyama
IJCAI 2015
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through experiments, we demonstrate that our system can successfully learn artists' styles and render pictures with consistent and smooth brush strokes. Figure 6 plots the average return over 16 trials as a function of policy update iterations, obtained by the policies learned by our approach. Returns at each trial are computed over 300 training episode samples. Stroke drawing results by an artist, the agent trained with the learned reward function, and the agent trained with the manually designed reward function [Xie et al., 2012] are compared in Figure 7. Finally, we applied the policy obtained by our method to the photo artistic conversion system [Xie et al., 2011] (Figure 8). To further investigate our IRL-based method, we performed a user study on the aesthetic assessment of traditional oriental ink painting simulation, comparing the proposed system with the brush stroke (Sumi-e) filter of state-of-the-art commercial software (Adobe Photoshop CC 2014). We invited 318 individuals to take the online questionnaire survey. |
| Researcher Affiliation | Academia | Ning Xie, Tingting Zhao, Feng Tian, Xiaohua Zhang, and Masashi Sugiyama. Tongji University, China; Tianjin University of Science and Technology, China; Bournemouth University, UK; Hiroshima Institute of Technology, Japan; The University of Tokyo, Japan. |
| Pseudocode | No | The paper describes algorithms like inverse RL and policy learning methods but does not provide them in pseudocode or a clearly labeled algorithm block. |
| Open Source Code | No | The paper does not provide any explicit statements about releasing source code for the methodology described, nor does it include a link to a code repository. |
| Open Datasets | No | The paper describes the creation of a custom dataset by video-recording brush motions using a specially designed device: 'To learn a particular artist's stroke drawing style, we collect stroke data from brush motion and drawings on the canvas and then learn the reward function from the collected data. We designed a device shown in Figure 3 to video-record brush motion.' However, it does not provide concrete access information (link, DOI, repository, or citation) indicating that this dataset is publicly available. |
| Dataset Splits | No | The paper mentions 'Returns at each trial are computed over 300 training episode samples' and a 'user study over 318 candidates', but it does not specify explicit training/validation/test dataset splits (e.g., percentages, sample counts, or references to predefined splits) for model training or evaluation. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments, such as GPU/CPU models, memory, or cloud instance types. |
| Software Dependencies | No | The paper does not provide specific ancillary software details, such as library names with version numbers, required to replicate the experiments. |
| Experiment Setup | Yes | The discounted cumulative reward along h, called the return, is given by $\sum_{t=1}^{T} \gamma^{t-1} R(s_t, a_t, s_{t+1})$, where $\gamma \in [0, 1)$ is the discount factor for future rewards. To set the values of the five parameters $\alpha_1, \alpha_2, \ldots, \alpha_5$, we use the maximum-margin inverse reinforcement learning method [Abbeel and Ng, 2004]. Figure 6 plots the average return over 16 trials as a function of policy update iterations, obtained by the policies learned by our approach. Returns at each trial are computed over 300 training episode samples. This graph shows that the average return sharply increases in an early stage and then converges at about the 20th iteration. (A minimal sketch of the return and feature-expectation computations is given below the table.) |
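
As a rough illustration of the quantities quoted in the Experiment Setup row, the sketch below computes the discounted return and the empirical discounted feature expectations that maximum-margin IRL [Abbeel and Ng, 2004] matches between expert demonstrations and the learned policy. The five-weight reward parameterization ($\alpha_1, \ldots, \alpha_5$) follows the paper's description, but the function names, feature representation, episode format, and discount value are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def discounted_return(rewards, gamma=0.99):
    """Return along an episode h: sum_{t=1}^{T} gamma^(t-1) * R(s_t, a_t, s_{t+1}).

    `rewards` is the list [R_1, ..., R_T]. gamma=0.99 is an assumed value;
    the quoted text only states gamma is in [0, 1).
    """
    return sum((gamma ** t) * r for t, r in enumerate(rewards))

def linear_reward(phi, alphas):
    """Reward as a weighted sum of five stroke features,
    R(s, a, s') = sum_i alpha_i * phi_i(s, a, s'),
    with weights alpha_1..alpha_5 fit by maximum-margin IRL."""
    return float(np.dot(alphas, phi))

def feature_expectations(episodes, gamma=0.99, n_features=5):
    """Empirical discounted feature expectations mu = E[ sum_t gamma^(t-1) * phi_t ],
    the quantity matched between expert and learned policies in max-margin IRL.
    Each episode is a list of 5-dimensional feature vectors (one per time step)."""
    mu = np.zeros(n_features)
    for episode in episodes:
        for t, phi in enumerate(episode):
            mu += (gamma ** t) * np.asarray(phi, dtype=float)
    return mu / len(episodes)
```

Under a linear reward of this form, the expected return of a policy equals the dot product of the weight vector with its discounted feature expectations, which is why matching feature expectations between the learned policy and the expert suffices to match expected returns in maximum-margin IRL.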