LightSpeed: Light and Fast Neural Light Fields on Mobile Devices
Authors: Aarush Gupta, Junli Cao, Chaoyang Wang, Ju Hu, Sergey Tulyakov, Jian Ren, László Jeni
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our method offers superior rendering quality compared to previous light field methods and achieves a significantly improved trade-off between rendering quality and speed. (Abstract) and We benchmark our approach on the real-world forward-facing [22] [23], the realistic synthetic 360 datasets [23] and unbounded 360 scenes [3]. (Section 4, Datasets) and As in Tab. 1, we obtain better results on all rendering fidelity metrics on the two bounded datasets. (Section 4.1, Rendering Quality) |
| Researcher Affiliation | Collaboration | Aarush Gupta¹, Junli Cao², Chaoyang Wang², Ju Hu², Sergey Tulyakov², Jian Ren², László A. Jeni¹ (¹Robotics Institute, Carnegie Mellon University; ²Snap Inc.). Project page: https://lightspeed-r2l.github.io |
| Pseudocode | No | No pseudocode or algorithm blocks were found in the paper. |
| Open Source Code | Yes | Project page: https://lightspeed-r2l.github.io |
| Open Datasets | Yes | We benchmark our approach on the real-world forward-facing [22] [23], the realistic synthetic 360 datasets [23] and unbounded 360 scenes [3]. (Section 4, Datasets) |
| Dataset Splits | No | The forward-facing dataset consists of 8 real-world scenes captured using cellphones, with 20-60 images per scene and 1/8th of the images used for testing. The synthetic 360 dataset has 8 scenes, each having 100 training views and 200 testing views. The unbounded 360 dataset consists of 5 outdoor and 4 indoor scenes with a central object and a detailed background. Each scene has between 100 to 300 images, with 1 in 8 images used for testing. (Section 4, Datasets) - While test splits are mentioned, no explicit validation split percentages or counts are given. A sketch of the 1-in-8 hold-out convention appears after the table. |
| Hardware Specification | Yes | We report and compare average inference times per rendered frame on various mobile chips, including Apple A15, Apple M1 Pro and Snapdragon SM8450 chips (Section 4, Baselines and Metrics) and All our experiments are conducted on Nvidia V100s and A100s. (Appendix A, Training details) |
| Software Dependencies | No | The paper mentions the Adam optimizer, Batch Norm, and GeLU activation, but does not specify version numbers for any software libraries or frameworks used (e.g., PyTorch, TensorFlow, CUDA). A hedged sketch of a decoder block built from these components follows the table. |
| Experiment Setup | Yes | We use Adam [18] optimizer with a batch size of 32 to train the feature grids and decoder network. We use an initial learning rate of 1e-5 with 100 warmup steps taking the learning rate to 5e-4. Beyond that, the learning rate decays linearly until the training finishes. (Appendix A, Training Details) and We train our frontal Light Speed models as well as each sub-scene model in non-frontal scenes for 200k iterations. (Section 4, Training Details) A sketch of this optimizer and learning-rate schedule, as reported, follows the table. |
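
The "1 in 8" test hold-out reported for the forward-facing and unbounded 360 scenes is commonly implemented by reserving every 8th image as a test view. The paper states only the ratio, not the exact indexing rule, so the every-8th-frame convention and the helper name `split_every_nth` below are assumptions. A minimal sketch:

```python
# Minimal sketch of a "1 in 8" test hold-out, assuming the common
# convention of taking every 8th image as a test view; the paper
# gives the ratio but not the indexing rule.
from typing import List, Tuple


def split_every_nth(image_paths: List[str], n: int = 8) -> Tuple[List[str], List[str]]:
    """Hold out every n-th image for testing; keep the rest for training."""
    test = [p for i, p in enumerate(image_paths) if i % n == 0]
    train = [p for i, p in enumerate(image_paths) if i % n != 0]
    return train, test


# Example: a 40-image forward-facing scene -> 35 training / 5 test views.
paths = [f"images/{i:03d}.png" for i in range(40)]
train_views, test_views = split_every_nth(paths, n=8)
```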
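
The paper names Batch Norm and GeLU as decoder components but does not give the exact layer layout, so the block below is a hypothetical PyTorch sketch: the class name `DecoderBlock`, the residual structure, and the width of 256 are placeholders, not the authors' architecture.

```python
# Hypothetical residual MLP block built from the components the paper
# names (Batch Norm, GeLU); width and layout are placeholders.
import torch
import torch.nn as nn


class DecoderBlock(nn.Module):
    def __init__(self, width: int = 256):
        super().__init__()
        self.body = nn.Sequential(
            nn.Linear(width, width),
            nn.BatchNorm1d(width),
            nn.GELU(),
            nn.Linear(width, width),
            nn.BatchNorm1d(width),
        )
        self.act = nn.GELU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Residual connection around the normalized MLP body.
        return self.act(x + self.body(x))


# Usage on a batch of 32 feature vectors (matching the reported batch size).
features = torch.randn(32, 256)
out = DecoderBlock()(features)
```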
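
The reported training setup (Adam, batch size 32, 100 warmup steps from 1e-5 to 5e-4, then linear decay over 200k iterations) can be expressed as a standard PyTorch `LambdaLR` schedule. The final learning rate of the decay is not stated in the paper, so decaying to 0 is an assumption, and the model and loss in the loop are stand-ins for the feature grids and decoder.

```python
# Sketch of the reported schedule: Adam, 100 warmup steps from 1e-5 to 5e-4,
# then linear decay until step 200k (decay target of 0 is assumed).
import torch


def lr_lambda(step: int,
              warmup_steps: int = 100,
              total_steps: int = 200_000,
              base_lr: float = 5e-4,
              init_lr: float = 1e-5) -> float:
    # LambdaLR multiplies the optimizer's base lr (5e-4) by this factor.
    if step < warmup_steps:
        frac = step / warmup_steps
        return (init_lr + frac * (base_lr - init_lr)) / base_lr
    frac = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return max(0.0, 1.0 - frac)


model = torch.nn.Linear(3, 3)  # stand-in for the feature grids + decoder
optimizer = torch.optim.Adam(model.parameters(), lr=5e-4)
scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda)

for step in range(200_000):
    optimizer.zero_grad()
    # Stand-in loss; the real model renders a batch of 32 rays per step.
    loss = model(torch.randn(32, 3)).pow(2).mean()
    loss.backward()
    optimizer.step()
    scheduler.step()
```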