PNeSM: Arbitrary 3D Scene Stylization via Prompt-Based Neural Style Mapping
Authors: Jiafu Chen, Wei Xing, Jiakai Sun, Tianyi Chu, Yiling Huang, Boyan Ji, Lei Zhao, Huaizhong Lin, Haibo Chen, Zhizhong Wang
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results demonstrate that our proposed framework is superior to SOTA methods in both visual quality and generalization. |
| Researcher Affiliation | Academia | ¹Zhejiang University, ²Nanjing University of Science and Technology |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks (clearly labeled algorithm sections or code-like formatted procedures). |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described in this paper. There is no explicit statement about code release or a repository link. |
| Open Datasets | Yes | Datasets. Following previous image stylization methods, we take Wiki Art (Karayev et al. 2013) as the style dataset. We conduct extensive experiments on real-world scenes, forward-facing LLFF (Mildenhall et al. 2019) and 360° unbounded Tanks and Temples dataset (Knapitsch et al. 2017). |
| Dataset Splits | No | The paper states: 'The training sets of LLFF dataset are Room, Horns, Leaves, Flower, Orchids, and we use Fern, Trex for evaluation. On Tanks and Temples dataset, we use Playground, Horse, Francis for training, and evaluate on Truck.' This specifies training and test sets but does not describe a separate validation split or how one was used (the reported splits are sketched in code after the table). |
| Hardware Specification | Yes | All experiments are performed on a single NVIDIA RTX A6000 (48G) GPU. |
| Software Dependencies | No | The paper mentions using DVGO and SANet but does not provide specific version numbers for these or other software dependencies like Python or PyTorch. |
| Experiment Setup | Yes | Following (Sun, Sun, and Chen 2022), we use the Adam optimizer with a learning rate of 0.1 for all voxels and 0.001 for MLP. λ_rec and λ_cycle are set to 1. During stylization, we adopt SANet (Park and Lee 2019) as the image style transfer network. The visual prompt is trained for 5k iterations using an Adam optimizer with a learning rate of 0.1. λ_style is set to 0.1. We use relu1_1, relu2_1, relu3_1, and relu4_1 layers in VGG-19 to calculate loss in Eq. 7. (A hedged sketch of this optimizer and loss setup follows the table.) |
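
For reference, the train/test scene splits quoted in the Dataset Splits row can be written down as a simple mapping. This is a minimal sketch, assuming a plain Python dictionary; the paper does not specify any configuration format, and no validation split is described.

```python
# Scene splits as reported in the paper; the dictionary layout itself is
# an assumption, since no config file accompanies the paper.
SCENE_SPLITS = {
    "llff": {
        "train": ["Room", "Horns", "Leaves", "Flower", "Orchids"],
        "test": ["Fern", "Trex"],
    },
    "tanks_and_temples": {
        "train": ["Playground", "Horse", "Francis"],
        "test": ["Truck"],
    },
}
```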
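The Experiment Setup row can likewise be read as a concrete training configuration. The PyTorch sketch below shows per-group Adam learning rates (0.1 for voxels, 0.001 for the MLP) and extraction of the relu1_1 through relu4_1 activations of VGG-19 for the style loss. `voxel_grid`, `mlp`, `VGGFeatures`, and the mean/std style loss are hypothetical stand-ins: the paper releases no code and its Eq. 7 is not reproduced here, so only the quoted hyperparameters (learning rates, layer choices, λ_style = 0.1) are taken from the paper.

```python
# Minimal PyTorch sketch of the quoted training setup. All module names are
# hypothetical; only the learning rates, λ_style, and the VGG-19 layer
# choices come from the paper.
import torch
import torch.nn as nn
import torchvision.models as models

# Hypothetical stand-ins for the voxel grid and MLP of the radiance field.
voxel_grid = nn.Parameter(torch.zeros(1, 12, 96, 96, 96))
mlp = nn.Sequential(nn.Linear(12, 128), nn.ReLU(), nn.Linear(128, 3))

# Adam with per-group learning rates: 0.1 for voxels, 0.001 for the MLP.
optimizer = torch.optim.Adam([
    {"params": [voxel_grid], "lr": 0.1},
    {"params": mlp.parameters(), "lr": 0.001},
])

class VGGFeatures(nn.Module):
    """Returns relu1_1, relu2_1, relu3_1, relu4_1 activations of VGG-19."""
    LAYER_IDS = (1, 6, 11, 20)  # indices of those ReLUs in torchvision's VGG-19

    def __init__(self):
        super().__init__()
        vgg = models.vgg19(weights=models.VGG19_Weights.IMAGENET1K_V1).features.eval()
        self.slices = nn.ModuleList()
        prev = 0
        for idx in self.LAYER_IDS:
            self.slices.append(vgg[prev:idx + 1])
            prev = idx + 1
        for p in self.parameters():
            p.requires_grad_(False)  # VGG is a frozen loss network

    def forward(self, x):
        feats = []
        for block in self.slices:
            x = block(x)
            feats.append(x)
        return feats

def style_loss(feats_out, feats_style, lambda_style=0.1):
    # Mean/std feature-statistic matching is an assumption; the paper's
    # Eq. 7 is not reproduced in the excerpt above.
    loss = 0.0
    for fo, fs in zip(feats_out, feats_style):
        loss = loss + (fo.mean(dim=(2, 3)) - fs.mean(dim=(2, 3))).pow(2).mean()
        loss = loss + (fo.std(dim=(2, 3)) - fs.std(dim=(2, 3))).pow(2).mean()
    return lambda_style * loss
```

The per-group learning rates follow the common PyTorch pattern of passing parameter groups to the optimizer, which matches the paper's description of distinct rates for the voxel grid and the MLP without assuming anything about the underlying DVGO architecture.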