Generative Neural Fields by Mixtures of Neural Implicit Functions
Authors: Tackgeun You, Mijeong Kim, Jungtaek Kim, Bohyung Han
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 5 Experiments: This section demonstrates the effectiveness of the proposed approach, referred to as mNIF, and discusses the characteristics of our algorithm based on the results. We run all experiments on the Vessl environment [2], and describe the detailed experiment setup for each benchmark in the appendix. 5.1 Datasets and Evaluation Protocols: We adopt the CelebA-HQ 64² [21], ShapeNet 64³ [8] and SRN Cars [34] datasets for image, voxel and neural radiance field (NeRF) generation, respectively, where 64² and 64³ denote the resolution of samples in the dataset. We follow the protocol from Functa [13] for images and NeRF scenes, and generative manifold learning (GEM) [12] for voxels. We adopt the following metrics for performance evaluation. To measure reconstruction quality, we use mean-squared error (MSE), peak signal-to-noise ratio (PSNR), reconstruction Fréchet inception distance (rFID), reconstruction precision (rPrecision) and reconstruction recall (rRecall). In image generation, we use the Fréchet inception distance (FID) score [17], precision, recall [32, 27] and F1 score between sampled images and images in the train split. Voxel generation performance is evaluated by coverage and maximum mean discrepancy (MMD) metrics [1] on a test split. In NeRF scene generation, we use the FID score between rendered images and images in the test split for all predefined views. |
| Researcher Affiliation | Academia | Tackgeun You³ (tackgeun.you@postech.ac.kr), Mijeong Kim¹ (mijeong.kim@snu.ac.kr), Jungtaek Kim⁴ (jungtaek.kim@pitt.edu), Bohyung Han¹˒² (bhhan@snu.ac.kr); ¹ECE & ²IPAI, Seoul National University, South Korea; ³CSE, POSTECH, South Korea; ⁴University of Pittsburgh, USA |
| Pseudocode | Yes | Algorithm 1 Meta-learning with mNIF |
| Open Source Code | No | The paper states 'Our implementation is based on PyTorch 1.9 with CUDA 11.3 and PyTorch Lightning 1.5.7. Our major routines are based on the code repositories for Functa, SIREN, latent diffusion model, guided diffusion and HQ-Transformer.' It lists libraries and other projects used, but it does not provide an explicit statement of releasing their own code for the proposed mNIF method, nor does it provide a link to their own repository. |
| Open Datasets | Yes | We adopt the CelebA-HQ 64² [21], ShapeNet 64³ [8] and SRN Cars [34] datasets for image, voxel and neural radiance field (NeRF) generation, respectively, where 64² and 64³ denote the resolution of samples in the dataset. |
| Dataset Splits | Yes | We divide the entire set of images into 27,000 images for the train split and 3,000 images for the test split, as provided by Functa [13]. It has 35,019 samples in the train split and 8,762 samples in the test split. The SRN Cars dataset has train, validation, and test splits. The train split has 2,458 scenes with 128×128-resolution images from 50 random views. The test split has 704 scenes with 128×128 images from 251 fixed views in the upper hemisphere. |
| Hardware Specification | Yes | The efficiency of mNIF is evaluated on an NVIDIA Quadro RTX 8000. |
| Software Dependencies | Yes | Our implementation is based on PyTorch 1.9 with CUDA 11.3 and PyTorch Lightning 1.5.7. |
| Experiment Setup | Yes | We use an mNIF configuration with the number of hidden layers L = 5, hidden-layer channel dimension W = 128, number of mixtures M = 384 and latent vector dimension H = 512 throughout our experiments. We use meta-learning for training the mNIF on CelebA-HQ 64². We use the Adam [22] optimizer with outer learning rate ϵ_outer = 1.0 × 10⁻⁴ and batch size 32. We use a cosine annealing learning rate schedule without a warm-up learning rate. We take the best mNIF after optimizing the model over 800 epochs with inner learning rates ϵ_inner ∈ {10.0, 1.0, 0.1}. |
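
The reconstruction metrics quoted above (MSE and PSNR) have standard definitions that are useful when checking reproduced numbers against the paper's tables. A minimal NumPy sketch; the function names are ours, not from the paper, and pixel values are assumed to be normalized to [0, 1]:

```python
import numpy as np

def mse(pred, target):
    """Mean-squared error between two arrays of pixel values."""
    pred, target = np.asarray(pred, dtype=float), np.asarray(target, dtype=float)
    return float(np.mean((pred - target) ** 2))

def psnr(pred, target, max_val=1.0):
    """Peak signal-to-noise ratio in dB, derived directly from the MSE.

    max_val is the maximum possible pixel value (1.0 for [0, 1]-normalized data).
    """
    return float(10.0 * np.log10(max_val ** 2 / mse(pred, target)))
```

For example, a prediction uniformly off by 0.25 gives an MSE of 0.0625 and a PSNR of about 12.04 dB; higher PSNR means better reconstruction.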
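
The experiment-setup row mentions a cosine annealing learning-rate schedule without warm-up, starting from an outer learning rate of 1.0 × 10⁻⁴ over 800 epochs. As a sketch of what that schedule looks like (this mirrors the closed-form used by PyTorch's `CosineAnnealingLR`; the function name and the zero minimum learning rate are our assumptions, not stated in the paper):

```python
import math

OUTER_LR = 1.0e-4   # outer learning rate from the paper's setup
EPOCHS = 800        # total training epochs from the paper's setup

def cosine_lr(epoch, base_lr=OUTER_LR, t_max=EPOCHS, min_lr=0.0):
    """Cosine annealing without warm-up: decays base_lr to min_lr over t_max epochs."""
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * epoch / t_max))
```

The schedule starts at the full outer learning rate, reaches half of it at the midpoint (epoch 400), and decays smoothly to the minimum at epoch 800, with no warm-up phase at the start.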