NVFi: Neural Velocity Fields for 3D Physics Learning from Dynamic Videos
Authors: Jinxi Li, Ziyang Song, Bo Yang
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct extensive experiments on multiple datasets, demonstrating the superior performance of our method over all baselines, particularly in the critical tasks of future frame extrapolation and unsupervised 3D semantic scene decomposition. Our code and data are available at https://github.com/vLAR-group/NVFi |
| Researcher Affiliation | Academia | vLAR Group, The Hong Kong Polytechnic University; jinxi.li@connect.polyu.hk, ziyang.song@connect.polyu.hk, bo.yang@polyu.edu.hk |
| Pseudocode | Yes | Algorithm 1: At a specific interframe timestamp ti, given a light ray ri with viewing angle (θ, φ) and S sample points {p1, …, ps, …, pS} along the ray, the objective of this algorithm is to determine the color and density values for the S points along ri, denoted as {(c1, σ1), …, (cs, σs), …, (cS, σS)}. In the meantime, we also have the keyframe dynamic radiance field fΘ and velocity field gΦ. Note that we shall not directly query fΘ to obtain color and density for the S points because: 1) the dynamic radiance field fΘ is never trained on interframe timestamps, thus the queried values are inaccurate; 2) the velocity field gΦ will not be involved, and therefore the interframes cannot provide additional constraints to optimize gΦ. (See the first sketch after this table.) |
| Open Source Code | Yes | Our code and data are available at https://github.com/vLAR-group/NVFi |
| Open Datasets | Yes | To validate our method, we further introduce two dynamic 3D datasets: 1) Dynamic Object dataset, and 2) Dynamic Indoor Scene dataset. We conduct extensive experiments on multiple datasets... Our code and data are available at https://github.com/vLAR-group/NVFi |
| Dataset Splits | No | The paper details training and testing splits for interpolation and extrapolation, but does not explicitly mention or define a separate 'validation' split. |
| Hardware Specification | Yes | All scenes are trained for 1.5 hours each on a single NVIDIA RTX 3090 GPU. |
| Software Dependencies | No | The paper mentions 'functorch' and 'Adam' but does not specify their version numbers or other software dependencies with versions. |
| Experiment Setup | Yes | During training, our keyframe radiance field starts with a space grid size of 64³ and increases its resolution in log scale at 2k, 4k, 6k, 8k, 10k iterations until 200³. The learning rate for feature planes is 0.02, and the learning rate for the color decoding neural network and the velocity field is 0.001. All learning rates are exponentially decayed to 1/10 at the final iteration 30k. We use Adam [29] for optimization with β1 = 0.9, β2 = 0.99. We sample 262144 points uniformly in the space [−1, 1]³ and time [0, 1] for the dynamic object datasets, and 1310672 points for the dynamic indoor scene datasets, every iteration. The required Jacobians of the velocity are calculated using autograd from functorch [26]. For all the sampled points, we only evaluate the physics losses in the occupied region, where the grid alpha α = 1 − exp(−σ · 0.01) > 0.0001. We set the loss weight for the divergence-free loss to 5, and the weight for the momentum conservation loss to 0.1. (See the second sketch after this table.) |
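
The Pseudocode row above summarizes Algorithm 1: interframe sample points are not fed directly to the keyframe radiance field fΘ; instead they are warped to a keyframe timestamp by integrating the velocity field gΦ, so that interframe supervision also constrains gΦ. The sketch below illustrates that step in PyTorch; the function names, call signatures, and the explicit Euler integrator with `n_steps` substeps are illustrative assumptions, not the authors' exact implementation.

```python
# Minimal sketch of the interframe rendering step in Algorithm 1, assuming a
# keyframe dynamic radiance field `radiance_field(x, d, t) -> (color, sigma)`
# and a velocity field `velocity_field(x, t) -> v`. All names and the Euler
# integration scheme are assumptions for illustration.
import torch

def query_interframe(points, view_dirs, t_i, t_key,
                     radiance_field, velocity_field, n_steps=8):
    """Warp sample points from interframe time t_i to keyframe time t_key by
    integrating the velocity field, then query the keyframe radiance field,
    so that interframe supervision also constrains the velocity field."""
    dt = (t_key - t_i) / n_steps                 # signed step toward the keyframe
    warped = points
    for k in range(n_steps):
        t = t_i + k * dt
        t_col = torch.full_like(warped[..., :1], t)
        warped = warped + velocity_field(warped, t_col) * dt   # explicit Euler step
    t_key_col = torch.full_like(warped[..., :1], t_key)
    color, sigma = radiance_field(warped, view_dirs, t_key_col)
    return color, sigma
```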
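
The Experiment Setup row mentions velocity Jacobians computed with functorch, an occupancy test on the grid alpha, and loss weights of 5 (divergence-free) and 0.1 (momentum conservation). A minimal sketch of how those pieces could fit together is shown below; `velocity_field`, its per-point signature, and the momentum residual (a material-derivative penalty Dv/Dt ≈ 0) are assumptions, not taken from the paper's released code.

```python
# Minimal sketch, assuming `velocity_field` maps a single (x, y, z, t) point to a
# 3-D velocity. The momentum residual Dv/Dt is an assumed form of the paper's
# momentum conservation loss, not its verified implementation.
import torch
from functorch import jacrev, vmap

def occupied_mask(sigma, step=0.01, thresh=1e-4):
    """Occupancy test from the Experiment Setup row: alpha = 1 - exp(-sigma * 0.01) > 0.0001."""
    return (1.0 - torch.exp(-sigma * step)) > thresh

def physics_losses(velocity_field, pts, w_div=5.0, w_mom=0.1):
    """pts: (M, 4) occupied space-time samples; returns the weighted physics loss."""
    jac = vmap(jacrev(velocity_field))(pts)      # (M, 3, 4) Jacobian of v w.r.t. (x, y, z, t)
    dv_dx = jac[..., :3]                         # spatial part (M, 3, 3)
    dv_dt = jac[..., 3]                          # temporal part (M, 3)
    v = vmap(velocity_field)(pts)                # (M, 3) velocities
    # Divergence-free loss: trace of the spatial Jacobian should vanish.
    div = dv_dx.diagonal(dim1=-2, dim2=-1).sum(-1)
    loss_div = div.abs().mean()
    # Assumed momentum loss: material derivative Dv/Dt = dv/dt + (v . grad)v ≈ 0.
    mom = dv_dt + torch.einsum('mij,mj->mi', dv_dx, v)
    loss_mom = mom.norm(dim=-1).mean()
    return w_div * loss_div + w_mom * loss_mom
```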