Beyond Euclidean: Dual-Space Representation Learning for Weakly Supervised Video Violence Detection
Authors: Jiaxu Leng, Zhanjie Wu, Mingpi Tan, Yiran Liu, Ji Gan, Haosheng Chen, Xinbo Gao
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Comprehensive experiments demonstrate the effectiveness of our proposed DSRL. |
| Researcher Affiliation | Academia | Jiaxu Leng (1,2), Zhanjie Wu (1,2), Mingpi Tan (1,2), Yiran Liu (1), Ji Gan (1,2), Haosheng Chen (1,2), Xinbo Gao (1,2). (1) Chongqing University of Posts and Telecommunications, Chongqing, China; (2) Chongqing Institute for Brain and Intelligence, Guangyang Bay Laboratory, Chongqing, China. gaoxb@cqupt.edu.cn |
| Pseudocode | No | The paper describes the methodology through text and diagrams but does not include explicit pseudocode or algorithm blocks. |
| Open Source Code | No | We will open-source the code in the future. |
| Open Datasets | Yes | Datasets. Under the multimodal input setting, we follow [34, 37, 25] to conduct experiments on XD-Violence, which is the only and extremely challenging VVD dataset with multimodal information. Under the unimodal input setting, both the XD-Violence [34] and UCF-Crime [30] datasets are used to evaluate our method. |
| Dataset Splits | No | The paper mentions training and test sets (e.g., '1,610 training videos' and '290 test videos' for UCF-Crime) but does not explicitly describe a separate validation set split or how it was used. |
| Hardware Specification | Yes | We use an Intel(R) Xeon(R) Platinum 8260 CPU @ 2.40GHz and an NVIDIA RTX A6000 GPU to conduct experiments. |
| Software Dependencies | Yes | We use CUDA 12.2, Python 3.9.16, and PyTorch 1.12.1. |
| Experiment Setup | Yes | Our proposed method is trained for 30 epochs in total, and the batch size is 256. The initial learning rate is 0.001, which is dynamically adjusted by a cosine annealing scheduler [13]. We use Adam [14] as the optimizer without weight decay. For hyper-parameters, we set β as 0.8, γ as 1.2, α as 0.3, and dropout rate as 0.6. |
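
The Experiment Setup row gives an initial learning rate of 0.001 adjusted by a cosine annealing scheduler over 30 epochs. As a rough illustration of what that schedule looks like, the sketch below evaluates the standard closed-form cosine annealing curve in plain Python. The source does not say how often the scheduler is stepped, how long one annealing cycle is, or what the minimum learning rate is; the per-epoch stepping, the 30-epoch cycle, and the floor of 0 (`ETA_MIN`) are assumptions, and the function name `cosine_annealed_lr` is hypothetical rather than taken from the authors' code.

```python
import math

# Values reported in the Experiment Setup row.
EPOCHS = 30
BATCH_SIZE = 256
INITIAL_LR = 1e-3
DROPOUT = 0.6

# Assumptions (not stated in the source): the schedule is stepped once per
# epoch, one annealing cycle spans all 30 epochs, and the floor is 0.
ETA_MIN = 0.0


def cosine_annealed_lr(epoch, total_epochs=EPOCHS, base_lr=INITIAL_LR, eta_min=ETA_MIN):
    """Closed-form cosine annealing (no restarts) evaluated at a given epoch."""
    cos_term = math.cos(math.pi * epoch / total_epochs)
    return eta_min + 0.5 * (base_lr - eta_min) * (1.0 + cos_term)


# Learning rate used during each of the 30 training epochs (epochs 0..29).
schedule = [cosine_annealed_lr(epoch) for epoch in range(EPOCHS)]

if __name__ == "__main__":
    for epoch in (0, 10, 15, 20, 29):
        print(f"epoch {epoch:2d}: lr = {schedule[epoch]:.6f}")
```

Under these assumptions the rate starts at 0.001, passes through 0.0005 at the midpoint, and approaches 0 by the final epoch. In PyTorch, a comparable setup would likely pair `torch.optim.Adam(params, lr=1e-3, weight_decay=0)` with `torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=30)`, though the exact arguments the authors used are not given in the source.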