QDETRv: Query-Guided DETR for One-Shot Object Localization in Videos
Authors: Yogesh Kumar, Saswat Mallick, Anand Mishra, Sowmya Rasipuram, Anutosh Maitra, Roshni Ramnani
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments show that the proposed model significantly outperforms the competitive baselines on two public benchmarks, VidOR and ImageNet-VidVRD, extended for one-shot open-set localization tasks. |
| Researcher Affiliation | Collaboration | Yogesh Kumar (1), Saswat Mallick (1), Anand Mishra (1), Sowmya Rasipuram (2), Anutosh Maitra (2), Roshni Ramnani (2); (1) Indian Institute of Technology Jodhpur, India; (2) Accenture Labs |
| Pseudocode | No | The paper includes mathematical equations and descriptions of the model components but does not provide any formal pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain an explicit statement about making the source code available or provide a link to a code repository for the methodology described. |
| Open Datasets | Yes | In this work, we employed three primary datasets: VidOR (Shang et al. 2019b), ImageNet-VidVRD (Shang et al. 2017b), and Open Images (Kuznetsova et al. 2018). ... The UCF101 (Soomro, Zamir, and Shah 2012) dataset, featuring 13,320 videos with diverse complexities, was employed for pretraining |
| Dataset Splits | Yes | To facilitate our study, we have extended two existing datasets, VidOR (Shang et al. 2019a) and ImageNet-VidVRD (Shang et al. 2017a), by splitting them into train and test sets... Dataset statistics are provided in Table 2. (Table 2 shows '#Train Videos' and '#Test Videos' counts) |
| Hardware Specification | Yes | We trained the model on three Nvidia-RTX A6000 GPUs. |
| Software Dependencies | No | Our implementation was done using the PyTorch library. (No version number is specified for PyTorch or any other software dependencies.) |
| Experiment Setup | Yes | We pre-train and fine-tune our models using the Adam optimizer (Kingma and Ba 2015) with an initial learning rate = 1e-5. ... We train the model for 200 epochs with batch size = 350. |
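The reported experiment setup (Adam optimizer, initial learning rate 1e-5, 200 epochs, batch size 350) can be sketched as a minimal PyTorch training skeleton. This is an illustrative reconstruction only: the model and loss below are placeholders, not the QDETRv architecture or its DETR-style matching losses, which the paper does not release code for.

```python
# Minimal sketch of the training configuration quoted above.
# Placeholder model and loss; only the hyperparameters (Adam, lr=1e-5,
# 200 epochs, batch size 350) come from the paper.
import torch
import torch.nn as nn

model = nn.Linear(256, 256)  # stand-in for the QDETRv model
optimizer = torch.optim.Adam(model.parameters(), lr=1e-5)

NUM_EPOCHS = 200   # "We train the model for 200 epochs"
BATCH_SIZE = 350   # "batch size = 350"

def train_step(batch: torch.Tensor) -> float:
    """One optimization step with a placeholder regression loss."""
    optimizer.zero_grad()
    output = model(batch)
    loss = output.pow(2).mean()  # placeholder; not the paper's loss
    loss.backward()
    optimizer.step()
    return loss.item()

# Single illustrative step on random data shaped like one batch.
batch = torch.randn(BATCH_SIZE, 256)
step_loss = train_step(batch)
```

In a full run, `train_step` would be called over every batch for all 200 epochs, with the pretraining and fine-tuning phases the paper describes sharing this optimizer configuration.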