Implicit Neural Representations with Levels-of-Experts
Authors: Zekun Hao, Arun Mallya, Serge Belongie, Ming-Yu Liu
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we validate LoE on four challenging tasks. In the first two experiments, we fit our model to high-resolution image and video data, evaluate its performance, and study the effect of various design components. Then, in Section 4.3, we evaluate our model on the indirectly supervised novel-view synthesis task and study its inductive bias. Finally, in Section 4.4, we demonstrate its generalization capability by training a generative adversarial network (GAN). |
| Researcher Affiliation | Collaboration | Zekun Hao (Cornell University): hz472@cornell.edu; Arun Mallya (NVIDIA): amallya@nvidia.com; Serge Belongie (Cornell University): sjb344@cornell.edu; Ming-Yu Liu (NVIDIA): mingyul@nvidia.com |
| Pseudocode | No | The paper does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | All the code will be made publicly available. |
| Open Datasets | Yes | We study the effect of our hierarchical weight tiling on model capacity and computational efficiency by fitting networks to a high-resolution image of size 8192 × 8192 pixels [25]. We fit our model to a video [44] with 300 frames and a resolution of 512 × 512. We compare our method with baselines on the Tanks and Temples dataset [21, 24]. The model is trained on the Flickr Faces-HQ (FFHQ) dataset [20] at a resolution of 256 × 256. |
| Dataset Splits | Yes | Fully specified in the supplemental material. |
| Hardware Specification | No | The main paper states that the 'total amount of compute and the type of resources used' are 'Included in the supplemental material,' but it does not specify any hardware details (like specific GPU or CPU models) within the main text. |
| Software Dependencies | No | The main paper does not provide specific software dependencies with version numbers for key components. It mentions 'Fully specified in the supplemental material' for training details, but no specific software versions are detailed in the main text. |
| Experiment Setup | Yes | For the generator, we use an 8-layer network with residual connections. The noise vector is directly fed into the first layer. For the discriminator, we use a multi-resolution patch discriminator [18] with spectral normalization [31]. The model is trained on the Flickr Faces-HQ (FFHQ) dataset [20] at a resolution of 256 × 256. We use hinge loss [22] as the GAN objective. All the models have 4 hidden layers and 256 hidden channels (from Table 2 caption). |
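For context on the hinge loss [22] cited in the experiment setup above, here is a minimal sketch of the standard hinge GAN objective in plain Python. The function names and list-of-floats interface are illustrative, not taken from the paper's code:

```python
def d_hinge_loss(real_logits, fake_logits):
    """Discriminator hinge loss: push real logits above +1 and fake logits below -1.

    Both arguments are lists of raw (unbounded) discriminator outputs.
    """
    real_term = sum(max(0.0, 1.0 - r) for r in real_logits) / len(real_logits)
    fake_term = sum(max(0.0, 1.0 + f) for f in fake_logits) / len(fake_logits)
    return real_term + fake_term


def g_hinge_loss(fake_logits):
    """Generator hinge loss: minimize the negative discriminator score on fakes."""
    return -sum(fake_logits) / len(fake_logits)
```

Once the logits are outside the unit margin (real above +1, fake below -1), the discriminator loss saturates at zero, which is the property that distinguishes the hinge objective from the non-saturating logistic GAN loss.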