A Motion-aware Spatio-temporal Graph for Video Salient Object Ranking

Authors: Hao Chen, Yufei Zhu, Yongjian Deng

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "4 Experiment; 4.1 Experimental Setup; 4.2 Compared to State-of-the-art Methods; 4.3 Ablation Study"
Researcher Affiliation | Academia | "(1) School of Computer Science and Engineering, Southeast University, Nanjing, China; (2) Key Laboratory of New Generation Artificial Intelligence Technology and Its Interdisciplinary Applications (Southeast University), Ministry of Education, China; (3) College of Computer Science, Beijing University of Technology, Beijing, China"
Pseudocode | No | The paper does not contain a clearly labeled 'Pseudocode' or 'Algorithm' block or figure.
Open Source Code | Yes | "Our codes/models are released at https://github.com/zyf-815/VSOR/tree/main."
Open Datasets | Yes | "we utilize the manually annotated masks provided in RVSOD to obtain instance masks and assign the saliency ranking score to each instance based on the distribution of fixation points. In this way, the instance-level annotations are generated. ... we utilize the video saliency detection dataset DAVSOD [20] to extract saliency ranking results ..." (see the annotation sketch below the table)
Dataset Splits | No | "we proceed to divide the remaining scenes into training and testing sets in a 4:1 ratio for both scenarios (b) and (c)." The paper specifies a training/testing split but does not explicitly mention a separate validation split or its proportion. (see the split sketch below the table)
Hardware Specification | Yes | "Our model is implemented using PyTorch and all experiments are conducted on a NVIDIA RTX4090."
Software Dependencies | No | The paper states 'Our model is implemented using PyTorch' but does not specify the PyTorch version or the versions of other software dependencies.
Experiment Setup | Yes | "Stochastic gradient descent (SGD) is employed to optimize the loss function. To facilitate training, we implement a warm-up strategy, commencing with an initial learning rate of 5e-3. At the 420,000 and 500,000 steps, the learning rate is reduced by a factor of 10. ... we set the batch size to 1 and the maximum iteration count to 200,000. For optimization, we utilize the Adam optimizer with an initial learning rate of 5e-6. At the 80,000th and 150,000th steps, the learning rate is reduced by a factor of 10." (see the training-schedule sketch below the table)
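
Annotation sketch: the 'Open Datasets' row describes turning RVSOD instance masks plus fixation points into instance-level ranking labels. The excerpt does not spell out the scoring rule, so the snippet below is only a minimal sketch under the assumption that instances are ranked by how much fixation mass falls inside each mask; the function name and inputs are hypothetical and do not come from the authors' released code.

```python
import numpy as np

def rank_instances_by_fixations(instance_masks, fixation_map):
    """Assign a saliency rank to each instance mask from a fixation map.

    instance_masks: list of HxW boolean arrays, one per instance (e.g. from RVSOD masks).
    fixation_map:   HxW array of fixation counts (or a binary fixation-point map).

    Assumed rule (not the paper's published procedure): sum the fixations
    covered by each mask and rank instances in descending order of that sum,
    with rank 1 being the most salient instance.
    """
    scores = [float(fixation_map[mask].sum()) for mask in instance_masks]
    order = np.argsort(scores)[::-1]                 # most-fixated instance first
    ranks = np.empty(len(instance_masks), dtype=int)
    ranks[order] = np.arange(1, len(instance_masks) + 1)
    return scores, ranks
```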
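
Split sketch: the 'Dataset Splits' row quotes a 4:1 scene-level train/test division. A minimal sketch of such a split is shown below; the shuffling strategy, seed, and rounding are assumptions, since the excerpt does not say how scenes are assigned to the two sets.

```python
import random

def split_scenes(scene_ids, train_ratio=0.8, seed=0):
    """Split scene (video) IDs into train/test sets at a 4:1 ratio.

    The fixed seed and random shuffle are illustrative assumptions only.
    """
    ids = sorted(scene_ids)
    random.Random(seed).shuffle(ids)
    cut = int(round(train_ratio * len(ids)))
    return ids[:cut], ids[cut:]
```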
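
Training-schedule sketch: the 'Experiment Setup' row quotes two optimization recipes, an SGD stage with warm-up, a 5e-3 base learning rate, and tenfold decays at 420,000 and 500,000 steps, and an Adam stage with batch size 1, 200,000 iterations, a 5e-6 base learning rate, and tenfold decays at 80,000 and 150,000 steps. The sketch below builds both schedules with standard PyTorch optimizers and schedulers; the warm-up length, the SGD momentum, and the reading of the two passages as separate training stages are assumptions not stated in the excerpt.

```python
from torch.optim import SGD, Adam
from torch.optim.lr_scheduler import LambdaLR, MultiStepLR

def build_stage1(model, warmup_steps=1_000):
    """SGD stage as quoted: warm-up, base LR 5e-3, x0.1 decay at steps 420k and 500k.

    warmup_steps=1_000 and momentum=0.9 are illustrative assumptions.
    """
    opt = SGD(model.parameters(), lr=5e-3, momentum=0.9)

    def lr_lambda(step):
        if step < warmup_steps:                    # linear warm-up to the base LR
            return (step + 1) / warmup_steps
        if step < 420_000:
            return 1.0
        return 0.1 if step < 500_000 else 0.01     # x0.1 at 420k, x0.1 again at 500k

    return opt, LambdaLR(opt, lr_lambda)

def build_stage2(model):
    """Adam stage as quoted: base LR 5e-6, x0.1 decay at steps 80k and 150k;
    batch size 1 and the 200k-iteration cap are handled by the training loop."""
    opt = Adam(model.parameters(), lr=5e-6)
    return opt, MultiStepLR(opt, milestones=[80_000, 150_000], gamma=0.1)
```

Both schedulers are intended to be stepped once per iteration, i.e. `scheduler.step()` immediately after `optimizer.step()`.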