VVS: Video-to-Video Retrieval with Irrelevant Frame Suppression

Authors: Won Jo, Geuntaek Lim, Gwangjin Lee, Hyunwoo Kim, Byungsoo Ko, Yukyung Choi

AAAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Its efficacy is proved via extensive experiments, and we show that our approach is not only state-of-the-art in video-level approaches but also has a fast inference time despite possessing retrieval capabilities close to those of frame-level approaches.
Researcher Affiliation | Collaboration | Won Jo¹, Geuntaek Lim¹, Gwangjin Lee¹, Hyunwoo Kim¹, Byungsoo Ko², Yukyung Choi¹; ¹Sejong University, ²NAVER Vision
Pseudocode | No | The paper includes pipeline overviews (Figure 3, Figure 4, Figure 5, Figure 6) but does not provide pseudocode or a clearly labeled algorithm block.
Open Source Code | Yes | Code is available at https://github.com/sejong-rcv/VVS
Open Datasets | Yes | VCDB (Jiang, Jiang, and Wang 2014) was used as a training dataset, and FIVR (Kordopatis-Zilos et al. 2019a) and CC_WEB_VIDEO (Wu et al. 2009) were used as evaluation datasets.
Dataset Splits | No | The paper mentions using specific datasets for training (VCDB) and evaluation (FIVR, CC_WEB_VIDEO) and the role of FIVR-5K for ablation studies. However, it does not explicitly provide specific percentages or counts for training, validation, and test splits within these datasets as used in their experiments, nor does it detail a specific splitting methodology in the main text. It refers to supplementary material for implementation details.
Hardware Specification | No | The paper does not provide specific details on the hardware used for experiments, such as GPU models, CPU types, or memory specifications.
Software Dependencies | No | The paper describes various components and layers used (e.g., LN-i MAC, R-MAC, Tensor Dot, transformer encoder, S-GAP, FC layers) and implies the use of a programming language, but it does not specify version numbers for any software dependencies, libraries, or frameworks used for implementation.
Experiment Setup | No | The paper states that 'implementation details are covered in the supplementary material' and mentions a parameter 'α' and thresholds 'λmag' and 'λdi' but does not provide their specific values or other concrete hyperparameters (e.g., learning rate, batch size, number of epochs) or system-level training settings in the main text.