reproducibilityindex.ai

Prompting Segmentation with Sound Is Generalizable Audio-Visual Source Localizer

Authors: Yaoting Wang, Weisong Liu, Guangyao Li, Jian Ding, Di Hu, Xi Li

AAAI 2024

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	extensive experiments demonstrate that this new paradigm outperforms other fusion-based methods in both the unseen class and cross-dataset settings. and To evaluate the grounding performance of our model, we conduct tests on AVS-Benchmarks and use mean intersection over union (mIoU) and F-score as the performance metrics, following previous works (Zhou et al. 2022b; Gao et al. 2023). Additionally, to assess the generalization ability, we split zero-shot and few-shot testing subsets based on AVS-Benchmarks and VGG-SS datasets.
Researcher Affiliation	Academia	1 Gaoling School of Artificial Intelligence, Renmin University of China; 2 School of Computer Science, Northwest Polytechnical University; 3 LIESMARS, Wuhan University; 4 College of Computer Science and Technology, Zhejiang University. Emails: yaoting.wang@outlook.com, liuweisong@mail.nwpu.edu.cn, guangyaoli@ruc.edu.cn, jian.ding@whu.edu.cn, dihu@ruc.edu.cn, xilizju@zju.edu.cn
Pseudocode	No	The paper describes the method using mathematical equations and figures, but no explicit pseudocode or algorithm blocks are provided.
Open Source Code	Yes	Project page: https://github.com/GeWu-Lab/Generalizable-Audio-Visual-Segmentation
Open Datasets	Yes	To evaluate the grounding performance of our model, we conduct tests on AVS-Benchmarks and use mean intersection over union (mIoU) and F-score as the performance metrics, following previous works (Zhou et al. 2022b; Gao et al. 2023). Additionally, to assess the generalization ability, we split zero-shot and few-shot testing subsets based on AVS-Benchmarks and VGG-SS datasets. and AVS-Benchmarks (Zhou et al. 2022b) is a dataset specifically designed for AVS tasks. and VGG-SS (Chen et al. 2021) is a dataset designed for the AVL task performance test.
Dataset Splits	No	To evaluate the grounding performance of our model, we conduct tests on AVS-Benchmarks and use mean intersection over union (mIoU) and F-score as the performance metrics, following previous works (Zhou et al. 2022b; Gao et al. 2023). Additionally, to assess the generalization ability, we split zero-shot and few-shot testing subsets based on AVS-Benchmarks and VGG-SS datasets. and Refer to the project page for detailed split settings.
Hardware Specification	No	The paper does not provide specific details on the hardware used for running the experiments (e.g., GPU models, CPU types, or cloud computing specifications).
Software Dependencies	No	The paper mentions models like VGGish and SAM, and techniques like contrastive learning and Fourier transform, but does not specify software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions).
Experiment Setup	No	The paper describes the model architecture and learning objectives but does not explicitly state specific hyperparameter values (e.g., learning rate, batch size, number of epochs) or detailed training configurations for reproduction.
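
The table above cites mean intersection over union (mIoU) and F-score as the paper's performance metrics but gives no formulas. The following is a minimal sketch of how those two metrics are commonly computed for binary segmentation masks; it is not taken from the paper or its repository. The function names, the nested-list mask representation, the default `beta_sq = 0.3` weighting, and the handling of empty masks are all assumptions made for illustration, and the authors' evaluation code may differ (for example, by thresholding predicted probabilities before scoring).

```python
from typing import Sequence, Tuple

# Assumed representation: a binary mask is a list of rows of 0/1 values.
Mask = Sequence[Sequence[int]]


def _counts(pred: Mask, target: Mask) -> Tuple[int, int, int]:
    """Return (true positives, predicted positives, actual positives)."""
    tp = pp = ap = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            tp += 1 if (p and t) else 0
            pp += 1 if p else 0
            ap += 1 if t else 0
    return tp, pp, ap


def iou(pred: Mask, target: Mask) -> float:
    """Intersection over union of two binary masks."""
    tp, pp, ap = _counts(pred, target)
    union = pp + ap - tp
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if union == 0 else tp / union


def mean_iou(preds: Sequence[Mask], targets: Sequence[Mask]) -> float:
    """Average the per-mask IoU over a set of prediction/target pairs."""
    scores = [iou(p, t) for p, t in zip(preds, targets)]
    return sum(scores) / len(scores)


def f_score(pred: Mask, target: Mask, beta_sq: float = 0.3) -> float:
    """Weighted F-measure; beta_sq = 0.3 is an assumed default, not a value from the source."""
    tp, pp, ap = _counts(pred, target)
    if tp == 0:
        return 0.0  # no overlap: precision or recall is zero (or undefined)
    precision = tp / pp
    recall = tp / ap
    return (1 + beta_sq) * precision * recall / (beta_sq * precision + recall)
```

For a 2x2 prediction `[[1, 1], [0, 0]]` scored against the target `[[1, 0], [0, 0]]`, one of the two predicted pixels is correct and the single target pixel is covered, so the IoU is 1/2, precision is 1/2, and recall is 1.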