DeTrack: In-model Latent Denoising Learning for Visual Object Tracking
Authors: Xinyu Zhou, Jinglun Li, Lingyi Hong, Kaixun Jiang, Pinxue Guo, Weifeng Ge, Wenqiang Zhang
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results validate the effectiveness of our approach, achieving competitive performance on several challenging datasets. |
| Researcher Affiliation | Academia | 1Shanghai Key Lab of Intelligent Information Processing, School of Computer Science, Fudan University, Shanghai, China 2Shanghai Engineering Research Center of AI & Robotics, Academy for Engineering and Technology, Fudan University, Shanghai, China |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks clearly labeled as 'Pseudocode' or 'Algorithm'. |
| Open Source Code | No | We will provide the code and model after the paper is accepted. |
| Open Datasets | Yes | The model is trained on full datasets (COCO, GOT-10k, TrackingNet, and LaSOT). |
| Dataset Splits | No | The paper mentions training on specific datasets and a testing phase, but does not explicitly provide specific details about training/validation/test dataset splits (e.g., percentages, sample counts for validation). |
| Hardware Specification | Yes | Our experiments are conducted on an Intel(R) Xeon(R) Gold 6326 CPU @ 2.90GHz with 252GB RAM and 8 NVIDIA GeForce RTX 3090 GPUs with 24GB memory each. |
| Software Dependencies | No | The paper does not provide specific version numbers for software dependencies (e.g., Python, PyTorch, CUDA versions). |
| Experiment Setup | Yes | A total of 240 epochs are trained, with the learning rate set to 8e-5 for the denoising ViT and 8e-6 for the box refining. The learning rate decreases by a factor of 10 at the 192nd epoch. |
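The reported training schedule (240 epochs, per-module base learning rates, 10x step decay at epoch 192) can be sketched as a plain step-schedule function. This is a minimal illustrative sketch, not the authors' code; the function and module names are assumptions.

```python
def lr_at_epoch(epoch, base_lr, decay_epoch=192, decay_factor=0.1, total_epochs=240):
    """Step schedule from the paper: base_lr until decay_epoch,
    then base_lr * decay_factor for the remaining epochs."""
    assert 0 <= epoch < total_epochs
    return base_lr * (decay_factor if epoch >= decay_epoch else 1.0)

# Per-module base learning rates as reported (module names are illustrative).
base_lrs = {"denoising_vit": 8e-5, "box_refining": 8e-6}

for name, base in base_lrs.items():
    print(name, lr_at_epoch(0, base), lr_at_epoch(192, base))
```

With a framework such as PyTorch, the same effect is typically achieved with two optimizer parameter groups and a `MultiStepLR` scheduler with `milestones=[192]` and `gamma=0.1`.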