QUEEN: QUantized Efficient ENcoding of Dynamic Gaussians for Streaming Free-viewpoint Videos
Authors: Sharath Girish, Tianye Li, Amrita Mazumdar, Abhinav Shrivastava, David Luebke, Shalini De Mello
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our approach, QUEEN, on two benchmark datasets, containing diverse scenes with large geometric motion and illumination changes. QUEEN outperforms all prior state-of-the-art online FVV methods on all metrics. Notably, for several highly dynamic scenes, it reduces the model size to just 0.7 MB per frame while training in under 5 sec and rendering at 350 FPS. |
| Researcher Affiliation | Collaboration | Sharath Girish (University of Maryland, sgirish@cs.umd.edu); Tianye Li (NVIDIA, tianyel@nvidia.com); Amrita Mazumdar (NVIDIA, amritam@nvidia.com); Abhinav Shrivastava (University of Maryland, abhinav@cs.umd.edu); David Luebke (NVIDIA, dluebke@nvidia.com); Shalini De Mello (NVIDIA, shalinig@nvidia.com) |
| Pseudocode | No | The paper describes methods in text and figures, but does not include a clearly labeled 'Pseudocode' or 'Algorithm' block. |
| Open Source Code | No | We aim to release the code in the future. |
| Open Datasets | Yes | We evaluate our method on two challenging FVV video datasets. (1) Neural 3D Videos (N3DV) [41] consists of six indoor scenes with forward-facing 20-view videos. (2) Immersive Videos [4] consists of seven indoor and outdoor scenes captured with 46 cameras. |
| Dataset Splits | No | The paper states: 'In both datasets, the central view is held out for testing.' and describes training on the remaining views. It does not explicitly define a separate validation dataset split for hyperparameter tuning or early stopping. |
| Hardware Specification | Yes | We train for 500 and 350 epochs for the first time-step, and for 10 and 15 epochs for the subsequent time-steps, for N3DV and Immersive, respectively, on an NVIDIA A100 GPU. |
| Software Dependencies | No | The paper mentions building its implementation on [29] (3D Gaussian Splatting) and using the Adam optimizer [30], but does not provide specific version numbers for these or other software dependencies like Python, PyTorch, or CUDA. |
| Experiment Setup | Yes | We train for 500 and 350 epochs for the first time-step, and for 10 and 15 epochs for the subsequent time-steps, for N3DV and Immersive, respectively... We set the SH degree to 2 for N3DV and 3 for Immersive. We set the score vector threshold td = 0.001 for all experiments... The position residual learning rate is set to 0.00016 for N3DV and 0.0005 for Immersive. Other hyperparameters are provided in Table 11. |