Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026.

ControlFusion: A Controllable Image Fusion Network with Language-Vision Degradation Prompts

Authors: Linfeng Tang, Yeda Wang, Zhanchuan Cai, Junjun Jiang, Jiayi Ma

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments demonstrate that ControlFusion outperforms SOTA fusion methods in fusion quality and degradation handling, particularly under real-world and compound degradations.
Researcher Affiliation | Academia | (1) Electronic Information School, Wuhan University; (2) School of Computer Science and Engineering, Macau University of Science and Technology; (3) Faculty of Computing, Harbin Institute of Technology
Pseudocode | No | The paper describes its methodology using mathematical formulations (e.g., Eq. 1-14) and network architectures (e.g., Fig. 2), but does not present a structured pseudocode block or algorithm.
Open Source Code | Yes | The source code is publicly available at https://github.com/Linfeng-Tang/ControlFusion.
Open Datasets | Yes | We select 2,050 high-quality clear images from RoadScene [36], LLVIP [7], and MSRS [29] datasets.
Dataset Splits | Yes | Finally, the DDL-12 dataset contains approximately 48,000 training image pairs and 4,800 test image pairs.
Hardware Specification | Yes | All experiments are conducted on NVIDIA RTX 4090 GPUs with an Intel(R) Xeon(R) Platinum 8180 CPU (2.50 GHz) using the PyTorch framework.
Software Dependencies | No | All experiments are conducted on NVIDIA RTX 4090 GPUs with an Intel(R) Xeon(R) Platinum 8180 CPU (2.50 GHz) using the PyTorch framework.
Experiment Setup | Yes | Our image restoration and fusion network is built on a four-stage encoder-decoder architecture, with channel dimensions increasing from 48 to 384, specifically configured as [48, 96, 192, 384]. The model is trained on the proposed DDL-12 dataset. During training, 224 × 224 patches are randomly cropped as inputs, with a batch size of 12 over 100 epochs. Optimization is performed using AdamW, starting with a learning rate of 1 × 10⁻³ and decayed to 1 × 10⁻⁵ via a cosine annealing schedule. For the loss configuration, λ1 and λ2 are set with a weight ratio of 1:3, and α_int, α_ssim, α_grad, and α_color are assigned values of 8:1:10:12, respectively.
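
The learning-rate schedule quoted in the Experiment Setup row can be made concrete with a short sketch. It assumes the standard cosine annealing formula and one learning-rate update per epoch; the quoted text specifies neither, and the function name `cosine_annealed_lr` is illustrative rather than taken from the ControlFusion code. The iteration count uses the approximate training-set size from the Dataset Splits row, so it is an estimate as well.

```python
import math

# Values quoted in the Experiment Setup and Dataset Splits rows.
LR_MAX = 1e-3         # initial learning rate
LR_MIN = 1e-5         # learning rate at the end of training
EPOCHS = 100
BATCH_SIZE = 12
TRAIN_PAIRS = 48_000  # "approximately 48,000" training image pairs


def cosine_annealed_lr(epoch: int,
                       total_epochs: int = EPOCHS,
                       lr_max: float = LR_MAX,
                       lr_min: float = LR_MIN) -> float:
    """Standard cosine annealing from lr_max down to lr_min.

    Stepping once per epoch is an assumption; the source does not say
    whether the schedule is updated per epoch or per iteration.
    """
    progress = epoch / total_epochs
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    for epoch in (0, 25, 50, 75, 100):
        print(f"epoch {epoch:3d}: lr = {cosine_annealed_lr(epoch):.6f}")
    # Rough number of optimizer steps per epoch at batch size 12.
    print("iterations per epoch (approx.):", TRAIN_PAIRS // BATCH_SIZE)
```

Under these assumptions the rate starts at 1 × 10⁻³, passes through the midpoint of the two bounds halfway through training, and reaches 1 × 10⁻⁵ at epoch 100. In PyTorch, the same shape is what `torch.optim.lr_scheduler.CosineAnnealingLR` produces when `eta_min` is set to the final rate.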