OccFusion: Rendering Occluded Humans with Generative Diffusion Priors

Authors: Adam Sun, Tiange Xiang, Scott Delp, Fei-Fei Li, Ehsan Adeli

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate OccFusion on ZJU-MoCap and challenging OcMotion sequences and find that it achieves state-of-the-art performance in the rendering of occluded humans.
Researcher Affiliation | Academia | Adam Sun, Tiange Xiang, Scott Delp, Li Fei-Fei, Ehsan Adeli; Stanford University; {adsun, xtiange}@stanford.edu
Pseudocode | No | The paper describes its pipeline in detail across three stages (Initialization, Optimization, Refinement) but does not include any formal pseudocode blocks or algorithms.
Open Source Code | No | Project page: https://cs.stanford.edu/~xtiange/projects/occfusion/. ... We are not including code in our submission.
Open Datasets | Yes | We evaluate OccFusion on ZJU-MoCap and challenging OcMotion sequences... ZJU-MoCap. ZJU-MoCap [44] is a dataset... OcMotion. OcMotion [15] comprises of 48 videos...
Dataset Splits | No | The paper specifies training on a subset of frames and evaluation on others, but it does not explicitly define a separate validation split or how it was used in the training process.
Hardware Specification | Yes | We train our entire pipeline for only 10 minutes on a single TITAN RTX GPU.
Software Dependencies | Yes | We use the pre-trained Stable Diffusion 1.5 model [48] with ControlNet [76] plugins for SDS in all the stages.
Experiment Setup | Yes | In the Initialization Stage, instead of inpainting incomplete human masks directly, we run the pretrained diffusion model to inpaint RGB images with 10 inference steps and 1.0 ControlNet conditioning scale. ... In the Optimization Stage, we train the 3D human Gaussian Π from scratch by following the objective Equation 5. We set λ_rgb = 1e4, λ_mask = 2e4, λ_ssim = 1e3, and λ_lpips = 1e3. ... In this stage, we train Π for 1200 steps. ... In this stage, we finetune Π for another 1800 steps with Gaussian densification and pruning enabled for the first 1000 steps.
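
The hyperparameters quoted in the Experiment Setup row can be collected into a small configuration sketch. This is not the authors' code (the summary states that none was released). The class and function names below are hypothetical, and the weighted-sum form of the objective is an assumption suggested by the four λ weights; Equation 5 itself is not reproduced in this summary and may contain additional terms.

```python
from dataclasses import dataclass
from typing import Mapping, Tuple


@dataclass(frozen=True)
class InitConfig:
    # Initialization Stage values quoted in the summary.
    inpaint_inference_steps: int = 10
    controlnet_conditioning_scale: float = 1.0


@dataclass(frozen=True)
class LossWeights:
    # Lambda values quoted for the Optimization Stage.
    rgb: float = 1e4
    mask: float = 2e4
    ssim: float = 1e3
    lpips: float = 1e3


@dataclass(frozen=True)
class Schedule:
    optimization_steps: int = 1200  # Optimization Stage length
    refinement_steps: int = 1800    # Refinement Stage length
    densify_steps: int = 1000       # densify/prune window at the start of refinement


def weighted_loss(terms: Mapping[str, float],
                  w: LossWeights = LossWeights()) -> float:
    """ASSUMPTION: a plain weighted sum of the four named loss terms."""
    return (w.rgb * terms["rgb"] + w.mask * terms["mask"]
            + w.ssim * terms["ssim"] + w.lpips * terms["lpips"])


def stage_at(step: int, s: Schedule = Schedule()) -> Tuple[str, int]:
    """Map a global training step to (stage name, step index within that stage)."""
    total = s.optimization_steps + s.refinement_steps
    if not 0 <= step < total:
        raise ValueError(f"step must be in [0, {total})")
    if step < s.optimization_steps:
        return "optimization", step
    return "refinement", step - s.optimization_steps


def refinement_densify_enabled(local_step: int,
                               s: Schedule = Schedule()) -> bool:
    """True while densification and pruning are active in the Refinement Stage."""
    return 0 <= local_step < s.densify_steps
```

With these values, training after the Initialization Stage spans 1200 + 1800 = 3000 steps, and densification with pruning is active only for refinement steps 0 through 999. The sketch deliberately says nothing about densification during the Optimization Stage, since the summary does not state it.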