DELIFFAS: Deformable Light Fields for Fast Avatar Synthesis
Authors: Youngjoong Kwon, Lingjie Liu, Henry Fuchs, Marc Habermann, Christian Theobalt
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our method on the DynaCap dataset [16], which is publicly available. DynaCap provides performance videos of 5 different subjects captured from 50 to 101 cameras, foreground masks and skeletal poses corresponding to each frame, and a template mesh for each subject. The training and testing videos consist of around 20k and 5k frames at a resolution of 1K (1285 × 940), respectively. Four cameras are held out for testing and the remaining ones are used for training, as proposed by the original dataset. |
| Researcher Affiliation | Academia | Youngjoong Kwon¹, Lingjie Liu²,³, Henry Fuchs¹, Marc Habermann³,⁴, Christian Theobalt³,⁴. ¹University of North Carolina at Chapel Hill; ²University of Pennsylvania; ³Max Planck Institute for Informatics, Saarland Informatics Campus; ⁴Saarbrücken Research Center for Visual Computing, Interaction and AI. |
| Pseudocode | No | The paper describes its method in detail and provides an overview figure, but does not include structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | The video results and code are available at https://vcai.mpi-inf.mpg.de/projects/DELIFFAS. |
| Open Datasets | Yes | We evaluate our method on the DynaCap dataset [16], which is publicly available. |
| Dataset Splits | No | The paper mentions training and testing splits: 'The training and testing videos consist of around 20k and 5k frames at the resolution of 1K (1285 × 940), respectively. Four cameras are held out for the testing and the remaining ones are used for the training as proposed by the original dataset.' However, it does not describe a separate validation split. A sketch of the camera hold-out convention appears after this table. |
| Hardware Specification | Yes | DELIFFAS runs at 31 fps when rendering a 1K (940 × 1285) video using a single A100 GPU with an Intel Xeon CPU. We train on a single RTX 8000 48GB GPU with a batch size of one. When tested on an A40 graphics card, we achieve a real-time speed of 26 fps. |
| Software Dependencies | No | The paper mentions using the 'Adam optimizer', a 'VGG-16 network', and 'TensorFlow's bi-linear sampling', but does not provide version numbers for these software dependencies or name the main programming language. |
| Experiment Setup | Yes | We use a learning rate of 2 × 10⁻⁴ and train on a single RTX 8000 48GB GPU with a batch size of one. We supervise our approach by minimizing the loss L = λ1·L1 + λperc·Lperc, where L1 and Lperc are the L1 loss and the perceptual [22] loss, respectively, and λ1 and λperc are their weights. For the first 940k iterations, λ1 is set to one and λperc is set to zero. Then we set λ1 and λperc so that the magnitude of each loss term is roughly the same and train for an additional 350k iterations. A sketch of this objective appears after this table. |
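
The camera hold-out noted in the Dataset Splits row is a simple per-camera partition. The sketch below is a hypothetical illustration only: the report does not state which four DynaCap cameras are reserved for testing, so the held-out indices are placeholders.

```python
# Hypothetical sketch of the DynaCap camera hold-out split described above.
# The four held-out camera indices are placeholders, not the official split.
HELD_OUT_CAMERAS = {7, 18, 27, 40}  # placeholder indices


def split_cameras(num_cameras: int):
    """Return (train_cameras, test_cameras) for a subject with num_cameras views."""
    all_cams = set(range(num_cameras))
    test_cams = sorted(HELD_OUT_CAMERAS & all_cams)
    train_cams = sorted(all_cams - HELD_OUT_CAMERAS)
    return train_cams, test_cams


# DynaCap subjects are captured with 50 to 101 cameras, e.g. a 101-camera subject:
train_cams, test_cams = split_cameras(101)
```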
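
The objective in the Experiment Setup row can be written out as a short training-loss sketch. This is a minimal illustration assuming a TensorFlow/Keras setup (TensorFlow is mentioned in the Software Dependencies row); the particular VGG-16 feature layers and the [0, 1] image range are assumptions, since the report only states that a VGG-16 perceptual loss is used.

```python
import tensorflow as tf

# Sketch of L = lambda_1 * L1 + lambda_perc * L_perc, assuming TensorFlow/Keras.
# The chosen VGG-16 layers and the [0, 1] image range are assumptions.
_vgg = tf.keras.applications.VGG16(include_top=False, weights="imagenet")
_features = tf.keras.Model(
    inputs=_vgg.input,
    outputs=[_vgg.get_layer(name).output
             for name in ("block2_conv2", "block3_conv3")],
)
_features.trainable = False


def combined_loss(pred, target, lambda_1=1.0, lambda_perc=0.0):
    """pred, target: rendered and ground-truth images in [0, 1], shape (B, H, W, 3)."""
    l1 = tf.reduce_mean(tf.abs(pred - target))
    if lambda_perc == 0.0:  # first ~940k iterations: L1 term only
        return lambda_1 * l1
    # Perceptual term: L1 distance between VGG-16 feature maps of both images.
    pred_feats = _features(tf.keras.applications.vgg16.preprocess_input(pred * 255.0))
    target_feats = _features(tf.keras.applications.vgg16.preprocess_input(target * 255.0))
    perc = tf.add_n([tf.reduce_mean(tf.abs(p - t))
                     for p, t in zip(pred_feats, target_feats)])
    return lambda_1 * l1 + lambda_perc * perc


# Adam with the learning rate stated in the Experiment Setup row.
optimizer = tf.keras.optimizers.Adam(learning_rate=2e-4)
```

The two-phase schedule from the report maps onto the `lambda_perc` argument: keep it at zero for the first ~940k iterations, then choose both weights so the two terms have roughly equal magnitude for the remaining ~350k iterations.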