Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
UniRelight: Learning Joint Decomposition and Synthesis for Video Relighting
Authors: Kai He, Ruofan Liang, Jacob Munkberg, Jon Hasselgren, Nandita Vijaykumar, Alexander Keller, Sanja Fidler, Igor Gilitschenski, Zan Gojcic, Zian Wang
NeurIPS 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate the perceptual quality of our relighting method compared to the baseline methods on the MIT multi-illumination dataset. Participants were shown a ground-truth relit image alongside two relighting results, one generated by our method and the other by a baseline model, with the order randomly shuffled. They were asked to select the result that more closely resembled the ground truth, considering aspects such as transparency, shadows, and reflections... In Table 1, we present a comparative analysis of our method against all baselines on both Synthetic Scenes and the MIT multi-illumination benchmark [48]... We ablate our joint modeling approach with quantitative and qualitative results in Table 3 and Figure 5. |
| Researcher Affiliation | Collaboration | Kai He (1,2,3), Ruofan Liang (1,2,3), Jacob Munkberg (1), Jon Hasselgren (1), Nandita Vijaykumar (2,3), Alexander Keller (1), Sanja Fidler (1,2,3), Igor Gilitschenski (2,3), Zan Gojcic (1), Zian Wang (1,2,3). Affiliations: (1) NVIDIA, (2) University of Toronto, (3) Vector Institute. |
| Pseudocode | No | The paper describes methods in prose and uses diagrams like Figure 2 for illustration but does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks with structured steps. |
| Open Source Code | No | We plan to release the code and data upon acceptance. The internal guidelines of our institution prevent us from releasing code at this stage. |
| Open Datasets | Yes | The multi-illumination dataset of Murmann et al. [48] consists of 985 real-world indoor scenes for training and 30 scenes for testing... Our synthetic dataset consists of rendered video clips... This dataset includes 3D assets from Poly Haven [67] and Objaverse [16]... |
| Dataset Splits | Yes | The multi-illumination dataset of Murmann et al. [48] consists of 985 real-world indoor scenes for training and 30 scenes for testing... Synthetic Scenes features 40 scenes... We cycle through the lighting conditions, selecting one video as the input, and the following as the relighting target, resulting in four different relighting tasks per scene. |
| Hardware Specification | Yes | The total training of two stages takes around 4 days on 32 A100 GPUs... The overall inference time for performing 35 denoising steps... is 445.5 seconds, measured on a single A100 GPU. |
| Software Dependencies | No | We fine-tune our models based on Cosmos-Predict1-7B-Video2World [49], a pre-trained DiT video diffusion model. All training is done with a batch size of 64, using the AdamW optimizer... (The paper mentions specific models and optimizers but not core software libraries with versions like Python, PyTorch, CUDA, etc.) |
| Experiment Setup | Yes | All training is done with a batch size of 64, using the AdamW optimizer with a learning rate of $2 \times 10^{-5}$, with mixed-precision (BF16) training at a resolution of 480 × 848 pixels. The AdamW optimizer was employed with a weight decay of 0.1. The exponential decay rates for the moment estimates β are set to 0.9 for the first moment and 0.99 for the second moment, with ϵ at $1 \times 10^{-10}$. During inference, we use 35 denoising steps. |
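
The hyperparameters and compute figures quoted in the Hardware Specification and Experiment Setup rows can be collected into a small configuration sketch. The class name `ReportedSetup`, its field names, and the helper methods are hypothetical and do not come from the paper. The keyword names in `adamw_kwargs` follow `torch.optim.AdamW`, which assumes a PyTorch-style optimizer; the paper does not state its framework. The derived numbers are rough: training time is reported only as "around 4 days", and the quoted inference time may cover more than the denoising loop.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReportedSetup:
    """Values quoted in the table above; all names here are illustrative."""

    batch_size: int = 64
    learning_rate: float = 2e-5
    weight_decay: float = 0.1
    beta1: float = 0.9            # decay rate for the first-moment estimate
    beta2: float = 0.99           # decay rate for the second-moment estimate
    eps: float = 1e-10
    resolution: tuple = (480, 848)  # pixels, in the order reported
    precision: str = "bf16"
    denoising_steps: int = 35
    train_days: float = 4.0       # "around 4 days", so only approximate
    train_gpus: int = 32          # A100 GPUs
    inference_seconds: float = 445.5  # single A100, 35 denoising steps

    def adamw_kwargs(self) -> dict:
        # Keyword names match torch.optim.AdamW (an assumed framework).
        return {
            "lr": self.learning_rate,
            "betas": (self.beta1, self.beta2),
            "eps": self.eps,
            "weight_decay": self.weight_decay,
        }

    def gpu_hours(self) -> float:
        # Approximate training budget: days * 24 hours * number of GPUs.
        return self.train_days * 24 * self.train_gpus

    def seconds_per_step(self) -> float:
        # Average wall-clock time per denoising step; an upper bound if the
        # reported total also includes non-denoising work.
        return self.inference_seconds / self.denoising_steps


if __name__ == "__main__":
    setup = ReportedSetup()
    print(setup.adamw_kwargs())
    print(f"~{setup.gpu_hours():.0f} A100 GPU-hours of training")
    print(f"~{setup.seconds_per_step():.1f} s per denoising step")
```

Under these assumptions the reported training run corresponds to roughly 3072 A100 GPU-hours, and inference averages a little under 13 seconds per denoising step.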