Learning to Estimate Single-View Volumetric Flow Motions without 3D Supervision
Authors: Aleksandra Franz, Barbara Solenthaler, Nils Thuerey
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We first present two ablations to illustrate the importance of handling depth ambiguity, and of the prototype density volumes. We then evaluate the method in comparison to a series of learned and optimization-based methods for a synthetic and a real-world dataset. |
| Researcher Affiliation | Academia | Aleksandra Franz, Technical University of Munich (TUM), franzer@in.tum.de; Barbara Solenthaler, ETH Zurich / TUM Institute for Advanced Study, solenthaler@inf.ethz.ch; Nils Thuerey, Technical University of Munich (TUM), nils.thuerey@tum.de |
| Pseudocode | No | The paper describes the model architecture and equations but does not include any explicit pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our source code is publicly available at https://github.com/tum-pbs/Neural-Global-Transport and includes the data and configurations necessary to reproduce all results of the paper. |
| Open Datasets | Yes | We evaluate our method on both synthetic smoke flows and the real-world captures from the Scalar Flow dataset (Eckert et al., 2019). |
| Dataset Splits | No | The paper describes the data used for training and evaluation (e.g., 'remaining 105 frames' for synthetic, '130 frames per scene' for real world), but does not explicitly provide specific train/validation/test dataset splits with percentages, sample counts, or citations to predefined splits. |
| Hardware Specification | Yes | Our method is implemented in TensorFlow version 1.12 under Python version 3.6 and trained on a Nvidia GeForce GTX 1080 Ti 11GB. |
| Software Dependencies | Yes | Our method is implemented in TensorFlow version 1.12 under Python version 3.6 and trained on a Nvidia GeForce GTX 1080 Ti 11GB. |
| Experiment Setup | Yes | Density training: G_ρ is trained with L_ρ = L_Î + 2e-4·L_D + 1e-3·L_z and a learning rate of 2e-4 with a decay of 2e-4; the decay is offset by -5000 iterations. We start at a grid resolution of 8x12x8 with 2 UNet levels. The resolution grows after 8k, 16k and 24k iterations by a factor of 2, adding a level of the UNet every time, thus reaching a maximum grid resolution of 64x96x64 with 5 levels. New levels are faded in over 3k iterations, starting 2k iterations after growth, by linearly interpolating between the up-sampled previous level and the current level. After fade-in, only the output of the highest active level remains. The image resolution grows in conjunction with the density. |
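
The "Experiment Setup" row above describes a progressive growing schedule: the grid starts at 8x12x8 with 2 UNet levels, doubles after 8k, 16k, and 24k iterations up to 64x96x64 with 5 levels, and each new level is faded in linearly over 3k iterations, starting 2k iterations after growth. The sketch below illustrates that schedule under those stated numbers; all function and parameter names (`active_stage_and_fade`, `fade_start_offset`, etc.) are illustrative assumptions, not identifiers from the released code.

```python
# Sketch of the progressive growing schedule quoted above.
# Names and structure are assumptions for illustration only.

def active_stage_and_fade(iteration,
                          grow_iters=(8_000, 16_000, 24_000),
                          fade_start_offset=2_000,
                          fade_duration=3_000):
    """Return the current growth stage (0 = base 8x12x8 grid, 2 UNet levels)
    and the fade-in weight alpha in [0, 1] for the newest level."""
    stage = sum(1 for g in grow_iters if iteration >= g)
    if stage == 0:
        return 0, 1.0  # base configuration, nothing to fade in
    fade_start = grow_iters[stage - 1] + fade_start_offset
    alpha = min(max((iteration - fade_start) / fade_duration, 0.0), 1.0)
    return stage, alpha


def grid_resolution(stage, base=(8, 12, 8)):
    """Grid resolution doubles per stage: 8x12x8 -> ... -> 64x96x64."""
    return tuple(d * 2 ** stage for d in base)


def blend_levels(prev_up, current, alpha):
    """Fade-in: linearly interpolate between the up-sampled previous
    level's output and the current level's output; once alpha == 1,
    only the output of the highest active level remains."""
    return (1.0 - alpha) * prev_up + alpha * current


if __name__ == "__main__":
    for it in (0, 8_000, 10_500, 13_000, 24_000, 29_000):
        stage, alpha = active_stage_and_fade(it)
        print(f"iter {it:>6}: grid {grid_resolution(stage)}, "
              f"UNet levels {2 + stage}, fade alpha {alpha:.2f}")
```

Running the sketch reproduces the quoted milestones: the grid reaches 64x96x64 with 5 levels after the 24k-iteration growth step, and each new level's alpha ramps from 0 to 1 between 2k and 5k iterations after its growth event.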