Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.
Dynamic Semantic-Aware Correlation Modeling for UAV Tracking
Authors: Xinyu Zhou, Tongxin Pan, Lingyi Hong, Pinxue Guo, HaiJing Guo, Zhaoyu Chen, Kaixun Jiang, Wenqiang Zhang
NeurIPS 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results validate the effectiveness of our method, achieving competitive performance on multiple UAV tracking datasets. |
| Researcher Affiliation | Academia | ¹College of Computer Science and Artificial Intelligence, Fudan University; ²College of Intelligent Robotics and Advanced Manufacturing, Fudan University; EMAIL, EMAIL |
| Pseudocode | No | The paper describes the overall framework of DSATrack in Figure 2 and provides detailed mathematical formulas and descriptions for its components like the Dynamic Semantic Relevance Generator and Hybrid Attention. However, it does not contain a block explicitly labeled 'Pseudocode' or 'Algorithm'. |
| Open Source Code | Yes | The code is available at https://github.com/zxyyxzz/DSATrack. |
| Open Datasets | Yes | We train the model on TrackingNet [44], GOT-10k [26], LaSOT [17], and COCO [39]. |
| Dataset Splits | Yes | We train the model on TrackingNet [44], GOT-10k [26], LaSOT [17], and COCO [39]. The experimental results and comparisons in Sections 4.2, 4.3, and 4.4 are conducted on the DTB70, UAVDT, VisDrone2018, and UAV123 benchmarks. These are standard benchmarks with predefined splits for evaluation. |
| Hardware Specification | Yes | All training tasks are conducted on 4 NVIDIA GeForce RTX 3090 GPUs with a batch size of 24. For fairness, the speed tests of other models and our model are conducted on the same NVIDIA RTX 3090 GPU without data loading overhead. These experiments were performed using a Jetson AGX Xavier edge computing platform to emulate realistic onboard conditions with limited computational resources. |
| Software Dependencies | No | The paper mentions specific frameworks and optimizers like 'ViT-B [15] as the backbone', 'MAE [23] initialized', 'AdamW optimizer', and 'L1 loss, Focal loss, GIoU loss and ratio loss'. However, it does not provide specific version numbers for underlying software such as Python, PyTorch, or CUDA. |
| Experiment Setup | Yes | Model. DSATrack employs the ViT-B [15] as the backbone, initialized with MAE [23]... Templates are resized to 128 × 128, while Search Region are resized to 256 × 256. Training. ...trained for 300 epochs with 3 templates. The learning rate for the Dynamic Semantic-aware Transformer and prediction head is set to 4 × 10⁻⁴, the learning rate for the remaining parameters is set to 4 × 10⁻⁵. At epoch 240, the learning rate is decayed by a factor of 10. In the second stage, the model is finetuned for 50 epochs with 6 templates. AdamW optimizer is used with a weight decay of 10⁻⁴. The loss function comprises the L1 loss, Focal loss, GIoU loss and ratio loss, consistent with RFGM [60]. ...batch size of 24. Inference. The size of our Templates & Patches is set to 3 × N_z. The update interval of Patches is set to 5 for t ≤ 100, doubled every 100 frames until t = 500, and then remains 160. At the 4th, 7th, and 10th layers, the number of retained template tokens is set to ⌊3 × N_z × 0.9⌋, ⌊3 × N_z × 0.8⌋, and ⌊3 × N_z × 0.7⌋, where ⌊·⌋ denotes the floor operation. |
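
The schedules quoted in the Experiment Setup row can be made concrete with a small sketch. This is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the epoch index is assumed to start at 0 with the decay applied from epoch 240 onward, and the frame boundaries are assumed to be inclusive (interval 5 for t ≤ 100, 160 for t > 500). The value N_z = 64 used in the usage note is only an example and is not taken from the paper.

```python
import math


def learning_rate(epoch: int, base_lr: float,
                  decay_epoch: int = 240, factor: float = 10.0) -> float:
    """Step decay: divide base_lr by `factor` from `decay_epoch` onward.

    Assumption: epochs are 0-indexed and the decay takes effect at epoch 240.
    """
    return base_lr / factor if epoch >= decay_epoch else base_lr


def patch_update_interval(t: int) -> int:
    """Patch update interval at frame t.

    Assumption: boundaries are inclusive, i.e. 5 for t <= 100, doubling for
    each further 100-frame block up to t = 500, and 160 afterwards.
    """
    if t <= 100:
        return 5
    if t > 500:
        return 160
    blocks = (t - 1) // 100  # 1 for frames 101..200, ..., 4 for frames 401..500
    return 5 * 2 ** blocks


def retained_template_tokens(n_z: int, ratios=(0.9, 0.8, 0.7)) -> list:
    """Retained template tokens at the 4th, 7th, and 10th layers: floor(3 * n_z * r)."""
    return [math.floor(3 * n_z * r) for r in ratios]


if __name__ == "__main__":
    print(learning_rate(0, 4e-4), learning_rate(240, 4e-4))
    print([patch_update_interval(t) for t in (50, 150, 250, 350, 450, 600)])
    print(retained_template_tokens(64))
```

With the example value N_z = 64, the total of 3 × 64 = 192 template tokens is reduced to 172, 153, and 134 at the three pruning layers. The two learning-rate groups (4 × 10⁻⁴ and 4 × 10⁻⁵) would each be passed through `learning_rate` with their own `base_lr`.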