ED-NeRF: Efficient Text-Guided Editing of 3D Scene With Latent Space NeRF

Authors: Jangho Park, Gihyun Kwon, Jong Chul Ye

ICLR 2024

Reproducibility Variable Result LLM Response
Research Type Experimental Our experimental results demonstrate that ED-NeRF achieves faster editing speed while producing improved output quality compared to state-of-the-art 3D editing models. Code and rendering results are available at our project page. (Section headings: 4 Experimental Results; 4.1 Baseline Methods; 4.2 Qualitative Results; 4.3 Quantitative Results; 4.4 Ablation Studies)
Researcher Affiliation Academia Jangho Park, Gihyun Kwon, Jong Chul Ye. Kim Jaechul Graduate School of AI, Robotics Program, Department of Bio and Brain Engineering, KAIST. {jhq1234,cyclomon,jong.ye}@kaist.ac.kr
Pseudocode No The paper does not include any clearly labeled pseudocode or algorithm blocks.
Open Source Code Yes Code and rendering results are available at our project page: https://jhq1234.github.io/ed-nerf.github.io/
Open Datasets Yes We utilized a database comprising real-world images, including the LLFF (Mildenhall et al., 2019) and IBRNet (Wang et al., 2021) datasets, as well as the human face dataset employed in Instruct-NeRF2NeRF (Haque et al., 2023).
Dataset Splits No The paper mentions training data and fine-tuning steps but does not explicitly provide percentages or counts for training, validation, or test dataset splits.
Hardware Specification Yes GPU memory and training time are measured on an RTX 3090.
Software Dependencies No The paper mentions using TensoRF and Adam optimizer but does not provide specific version numbers for any software dependencies like programming languages, libraries, or frameworks.
Experiment Setup Yes For optimizing NeRF in latent space, we use TensoRF as the backbone for fast and efficient rendering with up to 64 resolution by supervision. As stated in the original TensoRF, we use the Adam optimizer. ... we set the learning rate for trainable density voxels, which form the sigma in the original TensoRF, to 0.04, while keeping the learning rates for other components the same as TensoRF at 0.02 during training. During the training process, ED-NeRF needed to form a feature map of size 64 resolution, therefore we configured the batch size to be 4096-pixel rays. We train ED-NeRF in latent space with source latent feature map supervision for 100k steps ... When fine-tuning ED-NeRF for editing purposes, we train ED-NeRF for 5k steps. We set λ_om as 100 and λ_im as 0.01...
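The reported hyperparameters can be gathered into a single configuration sketch. This is a minimal illustration under stated assumptions, not the authors' code: the names (`ED_NERF_CONFIG`, `param_groups`, the `"density"` substring match) are hypothetical, and only the numeric values come from the paper.

```python
# Hypothetical sketch of the ED-NeRF training setup as reported in the
# reproducibility evidence above; identifiers are illustrative only.

ED_NERF_CONFIG = {
    "backbone": "TensoRF",       # latent-space NeRF backbone
    "resolution": 64,            # latent feature map resolution
    "optimizer": "Adam",
    "lr_density_voxels": 0.04,   # trainable density voxels (sigma)
    "lr_other": 0.02,            # remaining TensoRF components
    "batch_size_rays": 4096,     # pixel rays per batch
    "train_steps": 100_000,      # latent-space supervision training
    "finetune_steps": 5_000,     # editing fine-tuning
    "lambda_om": 100.0,          # loss weights as reported
    "lambda_im": 0.01,
}


def param_groups(named_params, cfg=ED_NERF_CONFIG):
    """Split (name, parameter) pairs into two optimizer groups,
    mirroring the paper's two-learning-rate Adam setup (sketch only)."""
    density = [p for n, p in named_params if "density" in n]
    other = [p for n, p in named_params if "density" not in n]
    return [
        {"params": density, "lr": cfg["lr_density_voxels"]},
        {"params": other, "lr": cfg["lr_other"]},
    ]
```

Such a grouping would be passed directly to an Adam constructor that accepts per-group learning rates.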