Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

STAR: Spatial-Temporal Tracklet Matching for Multi-Object Tracking

Authors: Xuewei Bai, Yongcai Wang, Deying Li, Haodi Ping, LI Chunxu

NeurIPS 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on diverse datasets, including MOTChallenge, DanceTrack, and VisDrone2021-MOT, demonstrate the robustness and versatility of STAR, significantly improving tracking performance under challenging conditions. The code is available at https://github.com/baixuewei430-dotcom/STAR.
Researcher Affiliation | Academia | (1) School of Information, Renmin University of China; (2) School of Computer Science, Beijing University of Technology; (3) China Waterborne Transport Research Institute
Pseudocode | No | The paper describes methodologies in paragraph text and uses mathematical equations for components like propagation (Section 3.3.2) and loss functions (Section 3.4), but it does not contain a dedicated pseudocode or algorithm block.
Open Source Code | Yes | The code is available at https://github.com/baixuewei430-dotcom/STAR.
Open Datasets | Yes | Extensive experiments on diverse datasets, including MOTChallenge, DanceTrack, and VisDrone2021-MOT, demonstrate the robustness and versatility of STAR, significantly improving tracking performance under challenging conditions. The code is available at https://github.com/baixuewei430-dotcom/STAR.
Dataset Splits | Yes | MOT17 includes 14 videos (7 for training) with three detection types: DPM [6], Faster R-CNN [7], and SDP [5]. MOT20 focuses on crowded scenes with 8 videos (4 for training and 4 for testing) that employ Faster R-CNN [7] detections. DanceTrack [48] contains 100 videos of various group dances, while VisDrone2021-MOT [49] comprises 96 sequences with around 40,000 frames across five object categories, posing challenges like occlusions and varying lighting conditions. ... The VisDrone2021-MOT-train set, which consists of 56 sequences, is used for training, while the VisDrone2021-MOT-test-dev set, containing 17 sequences, is used for testing. ... For the UAVDT dataset, 40 sequences are randomly selected for training, while 10 sequences are designated for testing.
Hardware Specification | Yes | We train our model on 24 NVIDIA RTX 2080Ti GPUs.
Software Dependencies | No | The proposed method is implemented using PyTorch. During training, 2N frames are sampled from each tracklet, resized to 256 × 128 pixels, and divided into two N-frame clips to enhance feature representation.
Experiment Setup | Yes | The input images are resized such that the shorter side is 800 pixels and the longer side is 1440 pixels. The proposed method is implemented using PyTorch. During training, 2N frames are sampled from each tracklet, resized to 256 × 128 pixels, and divided into two N-frame clips to enhance feature representation. The initial learning rate is set to 0.0003 and is reduced by a factor of 0.1 every 40 epochs. The model is trained for 150 epochs using the Adam optimizer with a mini-batch size of 32.
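
The training settings quoted under Experiment Setup can be illustrated with a minimal, framework-free sketch. It is not the authors' code. The function names, the 0-indexed epoch convention, and the uniform-random, order-preserving frame sampling are assumptions made here for illustration; only the numeric values come from the quoted text.

```python
import random

# Values quoted in the Experiment Setup entry.
INITIAL_LR = 3e-4    # initial learning rate (0.0003)
DECAY_FACTOR = 0.1   # the rate is multiplied by this factor...
DECAY_EVERY = 40     # ...every 40 epochs
TOTAL_EPOCHS = 150   # total training length


def learning_rate(epoch):
    """Step-decay learning rate for a 0-indexed epoch (indexing is an assumption)."""
    return INITIAL_LR * DECAY_FACTOR ** (epoch // DECAY_EVERY)


def sample_two_clips(tracklet, n, rng=None):
    """Sample 2N frames from a tracklet and split them into two N-frame clips.

    Assumptions not stated in the source: frames are drawn uniformly at random
    without replacement, kept in temporal order, and the first N sampled frames
    form the first clip.
    """
    rng = rng or random.Random()
    if len(tracklet) < 2 * n:
        raise ValueError("tracklet has fewer than 2N frames")
    indices = sorted(rng.sample(range(len(tracklet)), 2 * n))
    frames = [tracklet[i] for i in indices]
    return frames[:n], frames[n:]


if __name__ == "__main__":
    # Print the learning rate at each epoch where the schedule changes.
    for epoch in range(0, TOTAL_EPOCHS, DECAY_EVERY):
        print(epoch, learning_rate(epoch))
    clip_a, clip_b = sample_two_clips(list(range(20)), n=4, rng=random.Random(0))
    print(clip_a, clip_b)
```

In PyTorch, a schedule of this shape is typically expressed with `torch.optim.lr_scheduler.StepLR(optimizer, step_size=40, gamma=0.1)` on top of `torch.optim.Adam`; whether the released STAR code does so is not stated in the quoted text.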