Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
UniLumos: Fast and Unified Image and Video Relighting with Physics-Plausible Feedback
Authors: Pengwei Liu, Hangjie Yuan, Bo Dong, Jiazheng Xing, Jinwang Wang, Rui Zhao, Weihua Chen, Fan Wang
NeurIPS 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 5 Experiments 5.1 Experimental Details 5.2 Main Results 5.3 Ablation Study Table 1: Quantitative comparison. Bold number indicate the best performance. Evaluation metrics. We evaluate relighting performance across three key dimensions: (1) visual fidelity: We assess image quality using Peak Signal-to-Noise Ratio (PSNR), Structural Similarity Index (SSIM), and Learned Perceptual Image Patch Similarity (LPIPS). |
| Researcher Affiliation | Collaboration | 1Zhejiang University, 2DAMO Academy, Alibaba Group, 3Hupan Lab, 4National University of Singapore |
| Pseudocode | Yes | Algorithm 1 Loss Sampling Strategy per Iteration |
| Open Source Code | Yes | Code is available at https://github.com/alibaba-damo-academy/Lumos-Custom. |
| Open Datasets | Yes | Built on Panda70M [8], we curate 110K high-quality video pairs and augment training with 1.2M additional relit images using IC-Light. we conducted additional evaluations on two public object-centric relighting benchmarks: Stanford Orb [24] and Navi [19] |
| Dataset Splits | Yes | For testing, we selected samples from the internal dataset, processed using the method described in Sec. B. These samples were evenly split: half for image generation at 768x512 resolution, and half for video generation at 480p resolution (832x480), with each video sample containing 49 frames. |
| Hardware Specification | Yes | All the models are trained with a batch size of 8 for 5,000 iterations on 8 NVIDIA H20 GPUs (with 96GB RAM). |
| Software Dependencies | No | The paper mentions specific models like "Qwen2.5-VL [1]", "Wan2.1-T2V-1.3B-480P [38]", and "Lotus [17]", but does not provide specific version numbers for general software dependencies such as programming languages (e.g., Python version) or libraries (e.g., PyTorch, CUDA versions). |
| Experiment Setup | Yes | We adopt the AdamW optimizer with the learning rate of 1e-5 for training the entire framework. All the models are trained with a batch size of 8 for 5,000 iterations on 8 NVIDIA H20 GPUs (with 96GB RAM). We adopt fixed weights of λ0 = 1.0 and λ1 = λ2 = 0.1 for all experiments. |
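
The hyperparameters quoted in the Experiment Setup row can be collected into a small configuration object, as in the sketch below. This is only an illustration: `TrainConfig` and `weighted_total` are hypothetical names that do not come from the paper or its repository, and the plain weighted sum is an assumption about how λ0, λ1, and λ2 are combined. The paper's Algorithm 1 describes a per-iteration loss sampling strategy, so the actual training step may not apply all three terms at every iteration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Values quoted in the Experiment Setup row; field names are hypothetical."""
    optimizer: str = "AdamW"
    learning_rate: float = 1e-5
    batch_size: int = 8
    iterations: int = 5_000
    num_gpus: int = 8          # NVIDIA H20, per the Hardware Specification row
    lambda0: float = 1.0       # fixed weight on the first loss term
    lambda1: float = 0.1       # fixed weight on the second loss term
    lambda2: float = 0.1       # fixed weight on the third loss term


def weighted_total(cfg: TrainConfig, loss0: float, loss1: float, loss2: float) -> float:
    """Combine three loss values as a plain weighted sum (an assumption, see lead-in)."""
    return cfg.lambda0 * loss0 + cfg.lambda1 * loss1 + cfg.lambda2 * loss2


if __name__ == "__main__":
    cfg = TrainConfig()
    # Placeholder loss values, chosen only to show the weighting.
    print(weighted_total(cfg, 1.0, 2.0, 3.0))
```

With these weights, the second and third terms each contribute one tenth of their raw value, so the first term dominates unless the auxiliary losses are an order of magnitude larger.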