GASP: Gated Attention for Saliency Prediction
Authors: Fares Abawi, Tom Weber, Stefan Wermter
IJCAI 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments indicate that fusion approaches achieve better results for static integration methods, whereas non-fusion approaches for which the influence of each modality is unknown result in better outcomes when coupled with recurrent models for dynamic saliency prediction. We show that gaze direction and affective representations contribute a prediction-to-ground-truth correspondence improvement of at least 5% compared to dynamic saliency models without social cues. |
| Researcher Affiliation | Academia | Fares Abawi, Tom Weber and Stefan Wermter, University of Hamburg {abawi, tomweber, wermter}@informatik.uni-hamburg.de |
| Pseudocode | Yes | Algorithm 1 SCD sampling and generation. Input: video and audio frames sampled from ds = AVE dataset. Parameters: window sizes W_SP = 15, W_GE = 7, W_GF = 5, W_FER = 0; output steps T_SP = 15, T_GE = 4, T_GF = 0, T_FER = 0. Output: modality windows mdl_win; output buffers buf_mdl. 1: for vid in ds do ... |
| Open Source Code | Yes | Code: http://software.knowledge-technology.info#gasp |
| Open Datasets | Yes | We train our GASP model on the social event subset of AVE [Tavakoli et al., 2020]. |
| Dataset Splits | No | We train our GASP model on the social event subset of AVE [Tavakoli et al., 2020]. ... The models are evaluated on the test subset of social event videos in AVE. |
| Hardware Specification | Yes | An NVIDIA RTX 2080 Ti GPU with 11 GB VRAM and 128 GB RAM is used for training all static and sequential models. To extract spatiotemporal maps in the first stage (SCD), we employ an NVIDIA TITAN RTX GPU with 24 GB VRAM and 64 GB RAM to accommodate all social cue detectors simultaneously. |
| Software Dependencies | No | The paper mentions using the Adam optimizer, but it does not specify software dependencies like programming language versions (e.g., Python), deep learning frameworks (e.g., PyTorch, TensorFlow) with their specific version numbers, or other libraries. |
| Experiment Setup | Yes | We employ the loss functions introduced by Tsiami et al. [2020], assigning the loss weights λ1 = 0.1, λ2 = 2, and λ3 = 1 to cross-entropy, CC, and NSS losses respectively. The model is trained using the Adam optimizer, having a learning rate of 0.001, with β1 = 0.9 and β2 = 0.999. All models are trained for 10k iterations with a batch size of 4. ... The loss LDAM with a weight λDAM = 0.5 is computed for optimizing the inverted stream parameters. |
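The quoted experiment setup combines three loss terms with fixed weights (λ1 = 0.1 for cross-entropy, λ2 = 2 for CC, λ3 = 1 for NSS). A minimal sketch of that weighted combination is below, using standard definitions of the CC and NSS saliency metrics (written as losses by negation). This is an illustration only: the paper does not specify its framework (see the Software Dependencies row), so NumPy is an assumption here, and the exact cross-entropy formulation and numerical details are guesses rather than the authors' implementation.

```python
import numpy as np

EPS = 1e-8
# Loss weights as reported in the paper: lambda1 (CE), lambda2 (CC), lambda3 (NSS)
L1, L2, L3 = 0.1, 2.0, 1.0

def ce_loss(pred, gt):
    """Pixel-wise binary cross-entropy between saliency maps in [0, 1]."""
    pred = np.clip(pred, EPS, 1 - EPS)
    return float(-(gt * np.log(pred) + (1 - gt) * np.log(1 - pred)).mean())

def cc_loss(pred, gt):
    """Negative Pearson correlation between predicted and ground-truth maps."""
    p, g = pred - pred.mean(), gt - gt.mean()
    return -float((p * g).sum() / (np.sqrt((p ** 2).sum() * (g ** 2).sum()) + EPS))

def nss_loss(pred, fix):
    """Negative Normalized Scanpath Saliency over binary fixation locations."""
    p = (pred - pred.mean()) / (pred.std() + EPS)
    return -float((p * fix).sum() / (fix.sum() + EPS))

def combined_loss(pred, gt, fix):
    """lambda1 * CE + lambda2 * CC + lambda3 * NSS, weighted as in the paper."""
    return L1 * ce_loss(pred, gt) + L2 * cc_loss(pred, gt) + L3 * nss_loss(pred, fix)
```

Per the setup row, such a loss would be minimized with Adam (learning rate 0.001, β1 = 0.9, β2 = 0.999) for 10k iterations at batch size 4; those optimizer settings are stated in the paper, while the loss internals above are a sketch.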