Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
TGFormer: Transformer with Track Query Group for Multi-Object Tracking
Authors: Rui Zeng, Yuanzhou Huang, Songwei Pei
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results show that our method achieves competitive performance on the MOT Challenge and DanceTrack datasets. Extensive ablation experiments further demonstrate the effectiveness of our method. |
| Researcher Affiliation | Academia | School of Computer Science (National Pilot Software Engineering School), Beijing University of Posts and Telecommunications, Beijing 100876, China {zengrui@, huangyuanzhou@, peisongwei@}bupt.edu.cn |
| Pseudocode | No | The paper describes the method using prose and mathematical equations but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any explicit statement about releasing source code for TGFormer, nor does it include links to a code repository. |
| Open Datasets | Yes | We conduct experiments on the MOT Challenge (Milan et al. 2016; Dendorfer et al. 2020) and DanceTrack (Sun et al. 2022) datasets. The MOT17 benchmark (Milan et al. 2016) includes 7 training sequences and 7 testing sequences... For this combined dataset, we train for 130 epochs... We also incorporate the CrowdHuman (Shao et al. 2018) validation set. |
| Dataset Splits | Yes | The MOT17 benchmark (Milan et al. 2016) includes 7 training sequences and 7 testing sequences... The MOT20 benchmark (Dendorfer et al. 2020) has 4 training and 4 testing sequences... For all ablation experiments, we split the sequences in the MOT17 training set into two halves, using one half as the training set and the other half as the validation set. |
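The ablation split quoted above divides the MOT17 training sequences into two halves for training and validation. A minimal sketch of one common interpretation, the standard MOT17 half-split where each sequence is cut at its midpoint (the function name and the per-sequence interpretation are assumptions, not stated in the paper):

```python
def half_split(frame_ids):
    """Split one sequence's frames at the midpoint: first half for
    training, second half for validation (assumed MOT17 half-split)."""
    mid = len(frame_ids) // 2
    return frame_ids[:mid], frame_ids[mid:]

# Example: a 10-frame sequence yields a 5/5 train/val split.
train, val = half_split(list(range(10)))
```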
| Hardware Specification | Yes | Training uses 4 NVIDIA A800 GPUs with a batch size of 1 per GPU, each batch containing a video clip with multiple frames. |
| Software Dependencies | No | TGFormer builds on MeMOTR (Gao and Wang 2023) with ResNet-50 as the backbone and DAB-Deformable-DETR pretrained on COCO as the detector. Training uses... The AdamW optimizer with a 2.0 × 10⁻⁴ learning rate is applied. This text lists software components but lacks specific version numbers for reproducibility. |
| Experiment Setup | Yes | Training uses 4 NVIDIA A800 GPUs with a batch size of 1 per GPU, each batch containing a video clip with multiple frames. The AdamW optimizer with a 2.0 × 10⁻⁴ learning rate is applied. Targets with scores below 0.5 or with IoU below 0.5 are filtered... The confidence thresholds are set to 0.85, 0.7, and 0.5... For this combined dataset, we train for 130 epochs, reducing the learning rate tenfold at the 120th epoch. The number of clip frames increases from the original 2 frames to 3, 4, 5, and 6 frames at the 50th, 70th, 90th, and 120th epochs, respectively. |
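The staged schedule in the Experiment Setup row (clip length growing from 2 to 6 frames at fixed epochs, with a tenfold learning-rate reduction at epoch 120) can be sketched as a simple epoch lookup. This is an illustrative reconstruction only; the function names and milestone table layout are not from the paper:

```python
BASE_LR = 2.0e-4  # AdamW learning rate reported in the paper

# (epoch at which the change takes effect, clip length in frames)
CLIP_MILESTONES = [(0, 2), (50, 3), (70, 4), (90, 5), (120, 6)]

def clip_frames(epoch: int) -> int:
    """Frames per training video clip at the given epoch."""
    frames = CLIP_MILESTONES[0][1]
    for start, n in CLIP_MILESTONES:
        if epoch >= start:
            frames = n
    return frames

def learning_rate(epoch: int) -> float:
    """Base rate of 2.0e-4, reduced tenfold at the 120th epoch."""
    return BASE_LR / 10 if epoch >= 120 else BASE_LR
```

Under this reading, `clip_frames(95)` returns 5 and `learning_rate(125)` returns 2.0e-5, matching the quoted schedule.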