DeTrack: In-model Latent Denoising Learning for Visual Object Tracking

Authors: Xinyu Zhou, Jinglun Li, Lingyi Hong, Kaixun Jiang, Pinxue Guo, Weifeng Ge, Wenqiang Zhang

NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results validate the effectiveness of our approach, achieving competitive performance on several challenging datasets.
Researcher Affiliation | Academia | (1) Shanghai Key Lab of Intelligent Information Processing, School of Computer Science, Fudan University, Shanghai, China; (2) Shanghai Engineering Research Center of AI & Robotics, Academy for Engineering and Technology, Fudan University, Shanghai, China
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks clearly labeled as 'Pseudocode' or 'Algorithm'.
Open Source Code | No | We will provide the code and model after the paper is accepted.
Open Datasets | Yes | The model is trained on the full datasets (COCO, GOT-10k, TrackingNet, and LaSOT).
Dataset Splits | No | The paper mentions training on specific datasets and a testing phase, but does not provide specific details about training/validation/test splits (e.g., percentages or sample counts for validation).
Hardware Specification | Yes | Our experiments are conducted on an Intel(R) Xeon(R) Gold 6326 CPU @ 2.90GHz with 252GB RAM and 8 NVIDIA GeForce RTX 3090 GPUs with 24GB memory each.
Software Dependencies | No | The paper does not provide specific version numbers for software dependencies (e.g., Python, PyTorch, CUDA versions).
Experiment Setup | Yes | The model is trained for a total of 240 epochs, with the learning rate set to 8e-5 for the denoising ViT and 8e-6 for box refining. The learning rate decreases by a factor of 10 at the 192nd epoch.
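
The Experiment Setup row fully specifies the optimization schedule, so a minimal PyTorch sketch of that configuration is given below. Only the two learning rates (8e-5 / 8e-6), the 240-epoch budget, and the 10x decay at epoch 192 come from the paper; the optimizer type (AdamW), the weight-decay value, and the placeholder modules `denoising_vit` and `box_refiner` are illustrative assumptions, not the authors' implementation.

```python
from torch import nn, optim

# Placeholder modules standing in for the paper's denoising ViT backbone and
# box-refining head (illustrative only; not the authors' actual architecture).
denoising_vit = nn.Linear(768, 768)
box_refiner = nn.Linear(768, 4)

# Separate learning rates as reported: 8e-5 for the denoising ViT, 8e-6 for box refining.
# AdamW and the weight-decay value are assumptions, not stated in the report above.
optimizer = optim.AdamW(
    [
        {"params": denoising_vit.parameters(), "lr": 8e-5},
        {"params": box_refiner.parameters(), "lr": 8e-6},
    ],
    weight_decay=1e-4,
)

# Learning rate drops by a factor of 10 at epoch 192 of the 240-epoch schedule.
scheduler = optim.lr_scheduler.MultiStepLR(optimizer, milestones=[192], gamma=0.1)

for epoch in range(240):
    # ... one training epoch over COCO / GOT-10k / TrackingNet / LaSOT would run here ...
    scheduler.step()
```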