Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Multimodal Causal Reasoning for UAV Object Detection
Authors: Nianxin Li, Mao Ye, Lihua Zhou, Shuaifeng Li, Song Tang, Luping Ji, Ce Zhu
NeurIPS 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on several public datasets confirm the state-of-the-art performance of our approach. The code, data and models will be released upon publication of this paper. (Abstract) and 5 Experiments section. |
| Researcher Affiliation | Academia | (1) University of Electronic Science and Technology of China; (2) CAIR, HKSIS, CAS; (3) University of Shanghai for Science and Technology, China |
| Pseudocode | No | The paper describes the proposed method in sections 4.1 and 4.2 using descriptive text and mathematical equations (e.g., Eq. 2, 3, 5, 6, 8, 9, 10, 11) but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | https://github.com/lnxwow/MCR-UOD (provided at the top of the paper). |
| Open Datasets | Yes | Three public datasets are used for aerial image object detection: VisDrone [8], UAVDT [7] and HRSC2016 [26]. |
| Dataset Splits | Yes | VisDrone contains 8599 drone-captured images (2000 × 1500 pixels), split into 6471 for training, 548 for validation, and 1580 for testing... UAVDT is designed for object detection and tracking, comprising 24143 training images and 16592 testing images (1024 × 540 pixels). |
| Hardware Specification | Yes | We chose YOLOv8 [40] as the backbone of our method and performed all training and validation on two NVIDIA GeForce RTX 3090 GPUs. |
| Software Dependencies | No | The paper mentions using YOLOv8 [40], CLIP [34, 39], and GPT [1] models and refers to the YOLOv8 backbone, but does not specify version numbers for any software libraries, frameworks (e.g., PyTorch, TensorFlow), or programming languages used in the implementation. |
| Experiment Setup | Yes | The number of training epochs is 75, with a batch size of 4. The initial learning rate lr0 is 0.001; the final learning rate lrf is 0.01; and the weight decay is 0.0005. |
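
The values quoted in the Dataset Splits and Experiment Setup rows can be collected into a small, checkable record. The sketch below is illustrative only: the names `TrainConfig`, `final_lr_if_fraction`, and `splits_consistent` are hypothetical and do not come from the paper or its repository. It also assumes, without confirmation from the source, that `lrf` follows the Ultralytics YOLO convention, where `lrf` is a fraction of `lr0` rather than an absolute rate.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class TrainConfig:
    """Hyperparameters as quoted in the Experiment Setup row (hypothetical container)."""
    epochs: int = 75
    batch: int = 4
    lr0: float = 0.001           # initial learning rate
    lrf: float = 0.01            # quoted as "the final learning rate lrf"
    weight_decay: float = 0.0005


def final_lr_if_fraction(cfg: TrainConfig) -> float:
    # ASSUMPTION: lrf is interpreted Ultralytics-style, as a fraction of lr0.
    # The paper's wording does not confirm this reading.
    return cfg.lr0 * cfg.lrf


# VisDrone split sizes quoted in the Dataset Splits row.
VISDRONE_SPLITS = {"train": 6471, "val": 548, "test": 1580}
VISDRONE_TOTAL = 8599


def splits_consistent(splits: dict, total: int) -> bool:
    """Return True when the per-split counts add up to the reported total."""
    return sum(splits.values()) == total


if __name__ == "__main__":
    cfg = TrainConfig()
    print(asdict(cfg))
    print("final lr under the fraction reading:", final_lr_if_fraction(cfg))
    print("VisDrone splits consistent:", splits_consistent(VISDRONE_SPLITS, VISDRONE_TOTAL))
```

Under the fraction reading, the schedule would end near 1e-5; if `lrf` is instead an absolute rate, the final value would be 0.01, which is larger than the initial rate. The released code would be the place to settle which reading applies.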