PNeSM: Arbitrary 3D Scene Stylization via Prompt-Based Neural Style Mapping

Authors: Jiafu Chen, Wei Xing, Jiakai Sun, Tianyi Chu, Yiling Huang, Boyan Ji, Lei Zhao, Huaizhong Lin, Haibo Chen, Zhizhong Wang

AAAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results demonstrate that our proposed framework is superior to SOTA methods in both visual quality and generalization.
Researcher Affiliation | Academia | Zhejiang University; Nanjing University of Science and Technology
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks (clearly labeled algorithm sections or code-like formatted procedures).
Open Source Code | No | The paper does not provide concrete access to source code for the described methodology; there is no explicit statement about code release and no repository link.
Open Datasets | Yes | Datasets. Following previous image stylization methods, we take WikiArt (Karayev et al. 2013) as the style dataset. We conduct extensive experiments on real-world scenes: the forward-facing LLFF dataset (Mildenhall et al. 2019) and the 360° unbounded Tanks and Temples dataset (Knapitsch et al. 2017).
Dataset Splits | No | The paper states: 'The training sets of the LLFF dataset are Room, Horns, Leaves, Flower, Orchids, and we use Fern, Trex for evaluation. On the Tanks and Temples dataset, we use Playground, Horse, Francis for training, and evaluate on Truck.' This specifies training and test sets but gives no details about a separate validation split or how one was used (the stated splits are laid out in the first sketch after this table).
Hardware Specification | Yes | All experiments are performed on a single NVIDIA RTX A6000 (48 GB) GPU.
Software Dependencies | No | The paper mentions using DVGO and SANet but does not provide specific version numbers for these or for other software dependencies such as Python or PyTorch.
Experiment Setup | Yes | Following (Sun, Sun, and Chen 2022), we use the Adam optimizer with a learning rate of 0.1 for all voxels and 0.001 for the MLP. λ_rec and λ_cycle are set to 1. During stylization, we adopt SANet (Park and Lee 2019) as the image style transfer network. The visual prompt is trained for 5k iterations using an Adam optimizer with a learning rate of 0.1. λ_style is set to 0.1. We use the relu1_1, relu2_1, relu3_1, and relu4_1 layers of VGG-19 to compute the loss in Eq. 7 (see the configuration sketch after this table).
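As a reading aid for the Dataset Splits row, the scene lists stated in the paper can be captured in a small Python mapping. This is only a sketch of the reported train/eval partition; the dictionary layout and helper function are illustrative assumptions, not the authors' code, and no validation split is defined because the paper does not describe one.

```python
# Scene splits as stated in the paper; no separate validation split is described.
# SCENE_SPLITS and scenes_for are illustrative names, not the authors' code.
SCENE_SPLITS = {
    "llff": {
        "train": ["Room", "Horns", "Leaves", "Flower", "Orchids"],
        "eval": ["Fern", "Trex"],
    },
    "tanks_and_temples": {
        "train": ["Playground", "Horse", "Francis"],
        "eval": ["Truck"],
    },
}

def scenes_for(dataset: str, split: str) -> list[str]:
    """Return the scene names used for a given dataset and split."""
    return SCENE_SPLITS[dataset][split]
```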
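The Experiment Setup row can likewise be read as a training configuration. The sketch below, in PyTorch, mirrors the reported hyper-parameters: per-group Adam learning rates (0.1 for voxels, 0.001 for the MLP, 0.1 for the visual prompt), λ_rec = λ_cycle = 1, λ_style = 0.1, 5k prompt iterations, and style features from the relu1_1, relu2_1, relu3_1, relu4_1 layers of VGG-19. The torchvision layer indices, the mean/std style statistic, and all function and class names are assumptions; the paper's exact Eq. 7 and its SANet integration are not reproduced here.

```python
import torch
import torch.nn as nn
from torchvision import models

# Hyper-parameters reported in the paper.
LR_VOXELS, LR_MLP, LR_PROMPT = 0.1, 1e-3, 0.1
LAMBDA_REC, LAMBDA_CYCLE = 1.0, 1.0   # reconstruction / cycle weights (losses not shown here)
LAMBDA_STYLE = 0.1
PROMPT_ITERS = 5_000

# Indices of relu1_1, relu2_1, relu3_1, relu4_1 in torchvision's VGG-19 feature stack
# (an assumption based on torchvision's layer ordering).
STYLE_LAYERS = (1, 6, 11, 20)


def make_optimizers(voxel_params, mlp_params, prompt_params):
    """Per-group Adam optimizers mirroring the learning rates stated in the paper."""
    recon_opt = torch.optim.Adam([
        {"params": voxel_params, "lr": LR_VOXELS},
        {"params": mlp_params, "lr": LR_MLP},
    ])
    prompt_opt = torch.optim.Adam(prompt_params, lr=LR_PROMPT)
    return recon_opt, prompt_opt


class VGGStyleLoss(nn.Module):
    """Style loss over the four relu*_1 VGG-19 feature maps named in the paper."""

    def __init__(self):
        super().__init__()
        vgg = models.vgg19(weights=models.VGG19_Weights.IMAGENET1K_V1).features.eval()
        # Split the feature stack into consecutive slices ending at each style layer.
        self.slices = nn.ModuleList()
        prev = 0
        for idx in STYLE_LAYERS:
            self.slices.append(nn.Sequential(*list(vgg.children())[prev:idx + 1]))
            prev = idx + 1
        for p in self.parameters():
            p.requires_grad_(False)

    def forward(self, rendered, style):
        loss, x, y = 0.0, rendered, style
        for block in self.slices:
            x, y = block(x), block(y)
            # Match channel-wise mean and std of the feature maps (an AdaIN-style
            # statistic; the exact form of the paper's Eq. 7 is an assumption here).
            loss = loss + (x.mean((2, 3)) - y.mean((2, 3))).pow(2).mean()
            loss = loss + (x.std((2, 3)) - y.std((2, 3))).pow(2).mean()
        return LAMBDA_STYLE * loss
```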