InstructPix2NeRF: Instructed 3D Portrait Editing from a Single Image
Authors: Jianhui Li, Shilong Liu, Zidong Liu, Yikai Wang, Kaiwen Zheng, Jinghui Xu, Jianmin Li, Jun Zhu
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments verify the effectiveness of our method and show its superiority against strong baselines quantitatively and qualitatively. Source code and pretrained models can be found on our project page: https://mybabyyh.github.io/InstructPix2NeRF. |
| Researcher Affiliation | Collaboration | Jianhui Li1, Shilong Liu1, Zidong Liu1, Yikai Wang1, Kaiwen Zheng1, Jinghui Xu2, Jianmin Li1, Jun Zhu1,2 — 1Dept. of Comp. Sci. & Tech., Institute for AI, BNRist Center, THBI Lab, Tsinghua-Bosch Joint ML Center, Tsinghua University, Beijing, 100084 China; 2Shengshu Technology, Beijing |
| Pseudocode | No | The paper describes the architecture and method steps using text and diagrams but does not include any formal pseudocode or algorithm blocks. |
| Open Source Code | Yes | Source code and pretrained models can be found on our project page: https://mybabyyh.github.io/InstructPix2NeRF. To facilitate progress in the field, we will be completely open-sourcing the model, training code, and the data we have curated. |
| Open Datasets | Yes | We train our conditional diffusion model on the dataset we prepared from FFHQ (Karras et al., 2019) and use CelebA-HQ (Karras et al., 2018) for evaluation. |
| Dataset Splits | No | The paper defines a test set ('The image test dataset is the first 300 images from CelebA-HQ (Karras et al., 2018)') but does not specify training/validation/test splits with percentages or counts for the FFHQ dataset used for training, nor a distinct validation set beyond the test set used for evaluation. |
| Hardware Specification | Yes | We set tth = 600, λid = 0.1 and trained the model on a 4-card NVIDIA GeForce RTX 3090 for 6 days with a batch size of 20 on a single card. |
| Software Dependencies | No | The paper mentions using pretrained models (e.g., EG3D, PREIM3D, CLIP) and various architectures (Diffusion Transformer, UNet, transformers), but it does not specify any software dependencies with version numbers (e.g., Python, PyTorch, CUDA versions). |
| Experiment Setup | Yes | We set tth = 600, λid = 0.1 and trained the model on a 4-card NVIDIA GeForce RTX 3090 for 6 days with a batch size of 20 on a single card. In our paper, we set p1 = 0.05, p2 = 0.05 as hyperparameters. |