Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
OVTR: End-to-End Open-Vocabulary Multiple Object Tracking with Transformer
Authors: Jinyang Li, En Yu, Sijia Chen, Wenbing Tao
ICLR 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results show that our method surpasses previous trackers on the open-vocabulary MOT benchmark while also achieving faster inference speeds and significantly reducing preprocessing requirements. The paper includes sections like "4 EXPERIMENTS", "4.1 DATASETS AND EVALUATION METRICS", "4.3 PERFORMANCE COMPARISON ON TAO DATASET", and presents numerous tables with performance metrics (e.g., Table 1, Table 2, Table 3) comparing proposed methods against state-of-the-art. |
| Researcher Affiliation | Academia | Jinyang Li, En Yu, Sijia Chen, Wenbing Tao — Huazhong University of Science and Technology |
| Pseudocode | No | The paper describes the methods and architecture using figures, diagrams, and textual explanations, including mathematical formulations for losses and strategies. However, it does not contain a clearly labeled 'Pseudocode' or 'Algorithm' block. |
| Open Source Code | Yes | Models and code are released at https://github.com/jinyanglii/OVTR. |
| Open Datasets | Yes | Experimental results on the TAO (Dave et al., 2020) dataset demonstrate that OVTR outperforms state-of-the-art methods... Additionally, in the KITTI (Geiger et al., 2012) transfer experiment... For training, we leveraged the LVIS dataset, which includes 1,203 categories... We compare open-vocabulary MOT performance on OVT-B dataset (Liang & Han). |
| Dataset Splits | Yes | The KITTI dataset, comprising 21 training and 29 test sequences, focuses on autonomous driving scenarios with diverse objects... For evaluation, we use the TAO validation dataset and designate certain base categories that were not learned during training as novel categories. |
| Hardware Specification | Yes | Training is conducted on 4 NVIDIA GeForce RTX 3090 GPUs. |
| Software Dependencies | No | The paper details training parameters, optimizers, and model architecture, but it does not specify version numbers for any software libraries, frameworks, or programming languages used (e.g., Python, PyTorch, or CUDA versions). |
| Experiment Setup | Yes | Training begins with the detection components, using a batch size of 2 for 33 epochs, a learning rate of 4e-5 that decays by a factor of 10 at the 20th epoch. Next, the dual-branch decoders and the updater are trained with a batch size of 1 for 16 epochs, starting with a learning rate of 4e-5, which decays at the 13th epoch. Multi-frame training is employed, progressively increasing the number of frames from 2 to 3, 4, and 5 at the 4th, 7th, and 14th epochs, respectively. The hyperparameter Οisol, the threshold for the matrix D, is set to a multiple of its mean value due to its variability. Table 12 and Table 13 list comprehensive hyper-parameters used in the detection and tracking training phases, respectively. |
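The two-phase schedule quoted in the Experiment Setup row can be sketched in code. This is a minimal, hedged illustration: the function names (`lr_at_epoch`, `frames_at_epoch`) and the zero-indexed epoch convention are assumptions for clarity; only the numeric values (batch sizes, epoch counts, learning rates, decay milestones, and the 2→3→4→5 frame progression) come from the reported setup.

```python
def lr_at_epoch(base_lr, decay_epoch, epoch, factor=0.1):
    """Step-decay schedule: the learning rate is multiplied by `factor`
    (a factor-of-10 reduction) once `decay_epoch` is reached."""
    return base_lr * factor if epoch >= decay_epoch else base_lr

def frames_at_epoch(epoch):
    """Progressive multi-frame schedule for phase 2:
    2 frames initially, then 3, 4, and 5 at epochs 4, 7, and 14."""
    if epoch >= 14:
        return 5
    if epoch >= 7:
        return 4
    if epoch >= 4:
        return 3
    return 2

# Phase 1: detection components — batch size 2, 33 epochs,
# lr 4e-5 decayed by 10x at the 20th epoch.
phase1 = [(e, lr_at_epoch(4e-5, 20, e)) for e in range(33)]

# Phase 2: dual-branch decoders and the updater — batch size 1, 16 epochs,
# lr 4e-5 decayed at the 13th epoch, with progressive multi-frame training.
phase2 = [(e, lr_at_epoch(4e-5, 13, e), frames_at_epoch(e)) for e in range(16)]
```

In a PyTorch training loop this step decay would typically be expressed with `torch.optim.lr_scheduler.MultiStepLR`; the explicit functions above only make the reported milestones easy to read off.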