E2E-MFD: Towards End-to-End Synchronous Multimodal Fusion Detection
Authors: Jiaqing Zhang, Mingxiang Cao, Weiying Xie, Jie Lei, Daixun Li, Wenbo Huang, Yunsong Li, Xue Yang
NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our extensive testing on multiple public datasets reveals E2E-MFD's superior capabilities, showcasing not only visually appealing image fusion but also impressive detection outcomes, such as a 3.9% and 2.0% mAP50 increase on horizontal object detection dataset M3FD and oriented object detection dataset DroneVehicle, respectively, compared to state-of-the-art approaches. |
| Researcher Affiliation | Academia | Jiaqing Zhang1, Mingxiang Cao1, Weiying Xie1, Jie Lei2, Daixun Li1, Wenbo Huang3, Yunsong Li1, Xue Yang4 1The State Key Laboratory of Integrated Services Networks, Xidian University 2University of Technology Sydney 3Southeast University 4Shanghai AI Laboratory |
| Pseudocode | No | The paper describes methods and processes but does not include a clearly labeled pseudocode or algorithm block. |
| Open Source Code | Yes | https://github.com/icey-zhang/E2E-MFD |
| Open Datasets | Yes | We conduct experiments on four widely-used visible-infrared image datasets: TNO [53], RoadScene [26], M3FD [17] and DroneVehicle [3]. |
| Dataset Splits | Yes | M3FD is adopted to evaluate both MF and OD performance. RoadScene with 37 image pairs, TNO with 42 image pairs and M3FD with 300 pairs are only used for the MF task in the testing stage, and the MF network is trained on the M3FD dataset, which is divided into a training set (2,940 image pairs) and a testing set (1,260 image pairs). Besides, DroneVehicle, consisting of 28,439 image pairs, is utilized to train and test MF and OD for oriented objects. |
| Hardware Specification | Yes | We conduct all the experiments with one GeForce RTX 3090 GPU. |
| Software Dependencies | Yes | The code of M3FD is based on Detectron2 [54], while the code of DroneVehicle is based on MMDetection 2.26.0 [55] and MMRotate 0.3.4 [56]. |
| Experiment Setup | Yes | In the training phase, E2E-MFD is optimized by AdamW with a batch size of 1. We set the learning rate to 2.5e-5 and the weight decay to 1e-4. The default number of training iterations is only 15,000. On the DroneVehicle dataset, the pretrained LSKNet [57] is used for the initialization of the object detection network, and we fine-tune it for 12 epochs with a batch size of 4. The E2E-MFD is optimized by AdamW, and the learning rate and the weight decay are set to 1e-4 and 0.05. (See the optimizer sketch below the table.) |
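
For quick reference, the Experiment Setup row is collected into the minimal PyTorch sketch below. Only the quoted hyperparameters (AdamW, batch sizes, learning rates, weight decay, iteration and epoch counts) come from the paper; the placeholder `model` and all variable names are ours, not part of the released code.

```python
# Minimal sketch of the reported training hyperparameters.
# The single-layer `model` is only a stand-in for the E2E-MFD
# fusion + detection network.
from torch import nn
from torch.optim import AdamW

model = nn.Conv2d(4, 3, kernel_size=3, padding=1)  # placeholder network

# M3FD (horizontal object detection): batch size 1, 15,000 iterations.
m3fd_optimizer = AdamW(model.parameters(), lr=2.5e-5, weight_decay=1e-4)
M3FD_BATCH_SIZE = 1
M3FD_ITERATIONS = 15_000

# DroneVehicle (oriented object detection): the detector is initialized
# from a pretrained LSKNet checkpoint and fine-tuned for 12 epochs.
dronevehicle_optimizer = AdamW(model.parameters(), lr=1e-4, weight_decay=0.05)
DRONEVEHICLE_BATCH_SIZE = 4
DRONEVEHICLE_EPOCHS = 12
```

Note that in the released repository the M3FD experiments run on Detectron2 and the DroneVehicle experiments on MMDetection/MMRotate, so these settings are declared in those frameworks' config files rather than constructed directly as above.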