TOT: Topology-Aware Optimal Transport for Multimodal Hate Detection
Authors: Linhao Zhang, Li Jin, Xian Sun, Guangluan Xu, Zequn Zhang, Xiaoyu Li, Nayu Liu, Qing Liu, Shiyao Yan
AAAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The newly achieved state-of-the-art performance on two publicly available benchmark datasets, together with further visual analysis, demonstrate the superiority of TOT in capturing implicit cross-modal alignment. Experiment: To verify the superiority of our proposed approach, in this section we firstly describe the evaluation settings. Then we display the model performance on two benchmarks of our TOT against other state-of-the-art unimodal and multimodal approaches. Finally, we conduct qualitative studies to analyze TOT's superiority in aligning clues from multiple modalities. |
| Researcher Affiliation | Academia | Linhao Zhang1,2,3, Li Jin1,2*, Xian Sun1,2, Guangluan Xu1,2, Zequn Zhang1,2, Xiaoyu Li1,2, Nayu Liu1,2,3, Qing Liu1,2, Shiyao Yan1,2,3 1Aerospace Information Research Institute, Chinese Academy of Sciences 2Key Laboratory of Network Information System Technology (NIST), Aerospace Information Research Institute 3School of Electronic, Electrical and Communication Engineering, University of Chinese Academy of Sciences |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks (clearly labeled algorithm sections or code-like formatted procedures). |
| Open Source Code | No | The paper does not include an unambiguous sentence where the authors state they are releasing the code for the work described in this paper, nor does it provide a direct link to a source-code repository. |
| Open Datasets | Yes | We evaluate our model on two publicly available harmful memes detection datasets, Harm-C and Harm-P, which consist of real-world memes related to COVID-19 and US politics, respectively. Table 1: Statistics of Harm-C and Harm-P dataset. |
| Dataset Splits | Yes | Table 1: Statistics of Harm-C and Harm-P dataset (columns: Dataset, Harmfulness, Train, Test, Valid, Total). Harm-C: ..., Train 3013, Test 354, Valid 177, Total 3544. Harm-P: ..., Train 3020, Test 355, Valid 177, Total 3552. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper mentions using CLIP and certain algorithms but does not provide specific ancillary software details, such as library names with version numbers (e.g., Python 3.8, PyTorch 1.9, CUDA 11.1). |
| Experiment Setup | Yes | Implementation Details: For each meme, we take d_h = 512 to extract representations. We limit the max lengths of the image and text feature sequences by setting n_p^2 = 49 and n_s = 77 respectively, which adopts the same configuration as the pre-training process. For the Kernel Mapping, we take a Gaussian kernel with ϵ = 0.1 and set the max number of Sinkhorn iterations to 3. For the topology reasoning, we take a configuration of reasoning steps N = 3 and a dimension h = 256 for node vectors. |
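
Since the paper provides neither pseudocode nor released code, the following is only a minimal sketch of what a Gaussian-kernel Sinkhorn step with the reported settings (ϵ = 0.1, 3 iterations, 49 image tokens, 77 text tokens, 512-dimensional features) might look like. The function name `sinkhorn_plan`, the squared Euclidean cost on L2-normalized features, the uniform marginals, and the random inputs are all assumptions made for illustration; they are not taken from the paper and should not be read as the authors' implementation.

```python
import numpy as np

def sinkhorn_plan(image_feats, text_feats, eps=0.1, n_iters=3):
    """Approximate an entropic optimal-transport plan between two sequences."""
    # L2-normalize both sequences (assumption of this sketch).
    a = image_feats / np.linalg.norm(image_feats, axis=1, keepdims=True)
    b = text_feats / np.linalg.norm(text_feats, axis=1, keepdims=True)

    # Squared Euclidean distance between unit vectors: ||x - y||^2 = 2 - 2 x.y
    cost = np.clip(2.0 - 2.0 * (a @ b.T), 0.0, None)  # shape (n_img, n_txt)

    # Gaussian kernel with bandwidth eps.
    kernel = np.exp(-cost / eps)

    # Uniform marginals over image and text tokens (assumption).
    mu = np.full(cost.shape[0], 1.0 / cost.shape[0])
    nu = np.full(cost.shape[1], 1.0 / cost.shape[1])

    # Sinkhorn scaling: alternately rescale rows and columns.
    u = np.ones_like(mu)
    v = np.ones_like(nu)
    for _ in range(n_iters):
        u = mu / (kernel @ v)
        v = nu / (kernel.T @ u)

    # Transport plan: diag(u) @ kernel @ diag(v).
    return u[:, None] * kernel * v[None, :]

# Random stand-ins for the image (49 x 512) and text (77 x 512) features.
rng = np.random.default_rng(0)
plan = sinkhorn_plan(rng.normal(size=(49, 512)), rng.normal(size=(77, 512)))
```

With only three iterations the row marginals are matched approximately, while the column marginals are matched exactly because the column rescaling is the last update applied.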