Humor Knowledge Enriched Transformer for Understanding Multimodal Humor
Authors: Md Kamrul Hasan, Sangwu Lee, Wasifur Rahman, Amir Zadeh, Rada Mihalcea, Louis-Philippe Morency, Ehsan Hoque (pp. 12972-12980)
AAAI 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our model achieves 77.36% and 79.41% accuracy in humorous punchline detection on the UR-FUNNY and MUStARD datasets, achieving a new state of the art on both datasets by margins of 4.93% and 2.94%, respectively. |
| Researcher Affiliation | Academia | 1 Department of Computer Science, University of Rochester, USA, 2 Language Technologies Institute, CMU, USA, 3 Computer Science & Engineering, University of Michigan, USA |
| Pseudocode | No | No explicit pseudocode or algorithm blocks (e.g., labeled "Algorithm" or formatted as code) were found in the paper. |
| Open Source Code | Yes | We provide details of the best model configurations and hyper-parameter search spaces in the supplementary material [1]. In our framework, it is possible to reproduce the same experiment on a K80 GPU for specific hyper-parameters and seed. [1] https://github.com/matalvepu/HKT |
| Open Datasets | Yes | UR-FUNNY: The UR-FUNNY (Hasan et al. 2019) is collected from TED talk videos... MUStARD: Multimodal Sarcasm Detection Dataset (MUStARD) (Castro et al. 2019) is compiled from popular TV shows... |
| Dataset Splits | No | The paper mentions experimenting with various hyperparameters which implies the use of a validation set, but it does not specify explicit split percentages, sample counts, or reference a predefined validation split for reproducibility. |
| Hardware Specification | Yes | In our framework, it is possible to reproduce the same experiment on a K80 GPU for specific hyper-parameters and seed. |
| Software Dependencies | No | The paper mentions several software components and tools (e.g., ALBERT, Transformer, COVAREP, OpenFace 2, P2FA, ConceptNet, GloVe embeddings, Adam optimizer, Linear scheduler), but it does not provide specific version numbers for any of them. |
| Experiment Setup | Yes | The Adam optimizer and a Linear scheduler are used to train the HKT model. We use different learning rates for the language, acoustic, visual and HCF encoders. The search space of the learning rates is {0.001, 0.0001, 0.00001, 0.000001}. Binary cross entropy is used as the loss function. We experiment with {1, 2, 3, 4, 5, 6, 7, 8} encoder layers and {1, 2, 3, 4, 6} cross-attention heads for the language, acoustic, visual and HCF encoders. For the Bimodal Cross Attention we experiment with {1, 2} layers and {1, 2, 4} attention heads. Dropout in the range [0.05, 0.30] (uniform distribution) is used to regularize the model. |
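
The Experiment Setup row describes a discrete-plus-continuous hyper-parameter search space. Below is a minimal random-search sketch of that space; the dictionary keys and the `sample_hkt_config` helper are illustrative names and are not taken from the released HKT repository.

```python
import random

# Search space as quoted in the Experiment Setup row (values from the paper;
# the structure and key names here are assumptions for illustration only).
SEARCH_SPACE = {
    # separate learning rates for the language, acoustic, visual and HCF encoders
    "lr_language": [1e-3, 1e-4, 1e-5, 1e-6],
    "lr_acoustic": [1e-3, 1e-4, 1e-5, 1e-6],
    "lr_visual":   [1e-3, 1e-4, 1e-5, 1e-6],
    "lr_hcf":      [1e-3, 1e-4, 1e-5, 1e-6],
    # per-modality encoder depth and cross-attention heads
    "encoder_layers":   [1, 2, 3, 4, 5, 6, 7, 8],
    "cross_attn_heads": [1, 2, 3, 4, 6],
    # Bimodal Cross Attention block
    "bimodal_layers": [1, 2],
    "bimodal_heads":  [1, 2, 4],
}

def sample_hkt_config(seed=None):
    """Draw one configuration uniformly at random from the search space."""
    rng = random.Random(seed)
    config = {key: rng.choice(values) for key, values in SEARCH_SPACE.items()}
    # dropout is drawn from a continuous uniform range rather than a discrete set
    config["dropout"] = rng.uniform(0.05, 0.30)
    return config

if __name__ == "__main__":
    # fixing the seed makes a sampled configuration reproducible, matching the
    # paper's note about reproducing a run for a specific hyper-parameter set and seed
    print(sample_hkt_config(seed=42))
```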