DELIFFAS: Deformable Light Fields for Fast Avatar Synthesis

Authors: Youngjoong Kwon, Lingjie Liu, Henry Fuchs, Marc Habermann, Christian Theobalt

NeurIPS 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate our method on the DynaCap dataset [16], which is publicly available. DynaCap provides performance videos of 5 different subjects captured from 50 to 101 cameras, foreground masks and skeletal poses corresponding to each frame, and a template mesh for each subject. The training and testing videos consist of around 20k and 5k frames, respectively, at a resolution of 1K (1285 × 940). Four cameras are held out for testing and the remaining ones are used for training, as proposed by the original dataset.
Researcher Affiliation | Academia | Youngjoong Kwon (1), Lingjie Liu (2,3), Henry Fuchs (1), Marc Habermann (3,4), Christian Theobalt (3,4). (1) University of North Carolina at Chapel Hill; (2) University of Pennsylvania; (3) Max Planck Institute for Informatics, Saarland Informatics Campus; (4) Saarbrücken Research Center for Visual Computing, Interaction and AI.
Pseudocode | No | The paper describes its method in detail and provides an overview figure, but does not include structured pseudocode or algorithm blocks.
Open Source Code | Yes | The video results and code are available at https://vcai.mpi-inf.mpg.de/projects/DELIFFAS.
Open Datasets | Yes | We evaluate our method on the DynaCap dataset [16], which is publicly available.
Dataset Splits | No | The paper mentions training and testing splits: 'The training and testing videos consist of around 20k and 5k frames, respectively, at a resolution of 1K (1285 × 940). Four cameras are held out for testing and the remaining ones are used for training, as proposed by the original dataset.' However, it does not explicitly describe a separate validation split. (A minimal sketch of the camera-level split appears after this table.)
Hardware Specification | Yes | DELIFFAS runs at 31 fps when rendering a 1K (940 × 1285) video using a single A100 GPU with an Intel Xeon CPU. We train on a single RTX 8000 48 GB GPU with a batch size of one. When tested on an A40 graphics card, we achieve a real-time speed of 26 fps.
Software Dependencies | No | The paper mentions using the 'Adam optimizer', a 'VGG-16 network', and 'Tensorflow's bi-linear sampling', but does not provide version numbers for these software dependencies or for the main programming language.
Experiment Setup | Yes | We used a learning rate of 2 × 10⁻⁴. We train on a single RTX 8000 48 GB GPU with a batch size of one. We supervise our approach by minimizing the loss L = λ1·L1 + λperc·Lperc, where L1 and Lperc are the L1 loss and the perceptual [22] loss, respectively, and λ1 and λperc are their weights. For the first 940k iterations, λ1 is set to one and λperc to zero. Then we set λ1 and λperc so that the magnitude of each loss term is roughly the same and train for an additional 350k iterations. (A training-loop sketch based on this description follows the table.)
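
A minimal sketch of the camera-level split described in the Dataset Splits row, assuming a Python setup. Only the number of held-out cameras (four) comes from the report; the total camera count and the specific held-out camera IDs below are hypothetical.

```python
# Hypothetical camera hold-out for a DynaCap-style multi-view capture.
# Only the count of held-out cameras (four) comes from the paper; the
# camera IDs and total number of views below are placeholders.
ALL_CAMERAS = list(range(101))      # e.g. a subject captured from 101 views
TEST_CAMERAS = {7, 27, 47, 67}      # hypothetical held-out test views
TRAIN_CAMERAS = [c for c in ALL_CAMERAS if c not in TEST_CAMERAS]

assert len(TEST_CAMERAS) == 4
assert len(TRAIN_CAMERAS) == len(ALL_CAMERAS) - len(TEST_CAMERAS)
```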
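
A minimal TensorFlow sketch of the supervision described in the Experiment Setup row: Adam with a 2 × 10⁻⁴ learning rate, an L1 term plus a VGG-16 perceptual term, and the two-phase weighting in which λperc is zero for the first 940k iterations. The VGG-16 layer used for features and the late-phase λperc value are assumptions; the paper only states that the weights are chosen so that both loss terms have roughly the same magnitude.

```python
import tensorflow as tf

# Adam optimizer with the learning rate reported in the paper.
optimizer = tf.keras.optimizers.Adam(learning_rate=2e-4)

# Frozen VGG-16 feature extractor for the perceptual loss. The layer choice
# ('block3_conv3') is an assumption; the paper only cites a VGG-16 network.
_vgg = tf.keras.applications.VGG16(include_top=False, weights="imagenet")
_vgg_features = tf.keras.Model(_vgg.input, _vgg.get_layer("block3_conv3").output)
_vgg_features.trainable = False

def perceptual_loss(pred, target):
    """L1 distance in VGG-16 feature space; images are assumed to lie in [0, 1]."""
    pre = tf.keras.applications.vgg16.preprocess_input
    return tf.reduce_mean(tf.abs(_vgg_features(pre(pred * 255.0)) -
                                 _vgg_features(pre(target * 255.0))))

def total_loss(pred, target, step):
    """L = lambda_1 * L1 + lambda_perc * L_perc with the paper's two-phase schedule."""
    l1 = tf.reduce_mean(tf.abs(pred - target))
    if step < 940_000:
        # Phase 1 (first 940k iterations): lambda_1 = 1, lambda_perc = 0.
        return l1
    # Phase 2 (further 350k iterations): lambda_perc chosen so both terms have
    # roughly equal magnitude; 0.05 is a placeholder, not the paper's value.
    return l1 + 0.05 * perceptual_loss(pred, target)
```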