Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Revisiting Generative Infrared and Visible Image Fusion Based on Human Cognitive Laws
Authors: Lin Guo, Xiaoqing Luo, Wei Xie, Zhancheng Zhang, Hui Li, Rui Wang, Zhenhua Feng, Xiaoning Song
NeurIPS 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results show that the proposed method achieves state-of-the-art fusion performance in qualitative and quantitative evaluations across multiple datasets and significantly improves semantic segmentation metrics. This fully demonstrates the advantages of this generative image fusion method, drawing inspiration from human cognition, in enhancing structural consistency and detail quality. |
| Researcher Affiliation | Academia | ¹School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, China; ²School of Electronic and Information Engineering, Suzhou University of Science and Technology, Suzhou, China. EMAIL EMAIL {zczhang}@usts.edu.cn |
| Pseudocode | Yes | B Algorithm: HCLFuse first applies an optimal-transport-based mapping T to the infrared image X, aligning its distribution with that of the visible image Y and thereby improving the optimization lower bound of the mutual-information objective. The aligned pair (T(X), Y) is then fed into a multi-scale, mask-regulated variational bottleneck encoder (VBE) to compress and model the latent representation z, so that z captures modality-discriminative and compact features under an unsupervised learning setting. Subsequently, z is refined through a reverse-time diffusion generation process, in which physically guided constraints are dynamically injected at each denoising timestep to regulate the evolution of latent features. Finally, the optimized latent representation z0 is decoded to produce the fused image F. The pseudocode implementations of both the training and inference procedures are provided in Algorithm 1 and Algorithm 2, respectively. |
| Open Source Code | Yes | The source code is available at https://github.com/lxq-jnu/HCLFuse |
| Open Datasets | Yes | HCLFuse is evaluated on four public datasets: MSRS [30], TNO [31], FMB [21] and MFNet [32], covering diverse conditions such as urban driving, nighttime military scenes, and adverse weather. |
| Dataset Splits | No | In the experiments, a subset is sampled to ensure diversity and representative coverage: 361 pairs are selected from MSRS, 42 pairs from TNO, 280 pairs from FMB, and 393 pairs from MFNet. These selected subsets are used to validate the generalization capability of HCLFuse across varying scenes and lighting conditions. |
| Hardware Specification | Yes | All experimental evaluations are performed on a computational platform equipped with an NVIDIA GeForce RTX 3090 GPU and an Intel(R) Core(TM) i7-6850K CPU operating at 3.60 GHz. |
| Software Dependencies | No | The paper mentions using the Adam optimizer and the Mask2Former framework but does not specify version numbers for any software libraries or dependencies, such as Python, PyTorch, or CUDA. |
| Experiment Setup | Yes | The Adam optimizer with a learning rate of 2 × 10⁻⁵ is used for parameter updates. |
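
For readers unfamiliar with the optimizer named in the Experiment Setup row, the following is a minimal pure-Python sketch of a single Adam update, assuming the reported learning rate is 2 × 10⁻⁵. It is not taken from the HCLFuse repository. The function name `adam_step` is hypothetical, and the values of `beta1`, `beta2`, and `eps` are the commonly used defaults, which the summary above does not state.

```python
import math


def adam_step(params, grads, m, v, t, lr=2e-5, beta1=0.9, beta2=0.999, eps=1e-8):
    """Apply one Adam update to flat lists of floats.

    lr follows the rate reported in the table (assumed to be 2e-5);
    beta1, beta2 and eps are assumed defaults, not values from the paper.
    t is the 1-based step counter used for bias correction.
    """
    new_params, new_m, new_v = [], [], []
    for p, g, m_i, v_i in zip(params, grads, m, v):
        # Exponential moving averages of the gradient and its square.
        m_i = beta1 * m_i + (1.0 - beta1) * g
        v_i = beta2 * v_i + (1.0 - beta2) * g * g
        # Bias correction compensates for the zero initialisation of m and v.
        m_hat = m_i / (1.0 - beta1 ** t)
        v_hat = v_i / (1.0 - beta2 ** t)
        new_params.append(p - lr * m_hat / (math.sqrt(v_hat) + eps))
        new_m.append(m_i)
        new_v.append(v_i)
    return new_params, new_m, new_v


# Toy usage: a few steps on f(x) = x^2, whose gradient is 2x.
x, m, v = [1.0], [0.0], [0.0]
for step in range(1, 4):
    grads = [2.0 * x[0]]
    x, m, v = adam_step(x, grads, m, v, step)
```

Because the bias-corrected ratio has magnitude close to 1 in the first steps, each early update moves a parameter by roughly the learning rate, which is why a rate this small implies slow, fine-grained parameter changes.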