Clickbait Detection via Contrastive Variational Modelling of Text and Label
Authors: Xiaoyuan Yi, Jiarui Zhang, Wenhao Li, Xiting Wang, Xing Xie
IJCAI 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on three clickbait detection datasets show our method's robustness to inadequate and biased labels, outperforming several recent strong baselines. |
| Researcher Affiliation | Collaboration | 1Microsoft Research Asia 2Tsinghua University |
| Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | No | The paper does not provide any statement or link regarding the availability of open-source code for the described methodology. |
| Open Datasets | Yes | We conduct experiments on three clickbait-related datasets. News Clickbait Detection (News): a public Kaggle competition dataset for news headline clickbait detection (https://www.kaggle.com/c/clickbait-news-detection). Tweet Clickbait Detection (Tweet): a multi-modal dataset for the Tweet posts clickbait detection competition (https://webis.de/events/clickbait-challenge). News Headline Incongruence Detection (NELA): an automatically constructed dataset for detecting incongruity between a given news headline and body text [Yoon et al., 2019]. |
| Dataset Splits | Yes | News: 17,538 training (23%), 1,500 validation (33%), 3,063 testing (33%). Tweet: 17,588 training (22%), 2,000 validation (25%), 17,554 testing (21%). NELA: 50,000 training (51%), 6,690 validation (51%), 6,745 testing (51%). |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU models, CPU types, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions using pre-trained models such as UniLM and BERT, but it does not specify software dependencies with version numbers (e.g., specific versions of deep learning frameworks or libraries). |
| Experiment Setup | Yes | The label embedding size, latent variable size, number of latent samples K, batch size and learning rate are 64, 256, 16, 24 and 2e-4, respectively. We use cyclic annealing [Fu et al., 2019] to alleviate the KL annealing problem in VAE training. |
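As a reading aid, the sketch below collects the hyperparameters reported in the experiment setup and pairs them with a cyclic KL-annealing schedule in the spirit of Fu et al. [2019]. The function name `cyclic_kl_weight`, the number of cycles, and the ramp ratio are illustrative assumptions (the defaults from Fu et al.'s paper), not values the authors report.

```python
# Minimal sketch, not the authors' implementation.
# Hyperparameters as reported in the paper's experiment setup.
HPARAMS = {
    "label_embedding_size": 64,
    "latent_variable_size": 256,
    "num_latent_samples_K": 16,
    "batch_size": 24,
    "learning_rate": 2e-4,
}

def cyclic_kl_weight(step: int, total_steps: int,
                     n_cycles: int = 4, ramp_ratio: float = 0.5) -> float:
    """Cyclic annealing weight for the KL term (Fu et al., 2019).

    Within each cycle, the weight ramps linearly from 0 to 1 over the
    first `ramp_ratio` fraction of the cycle, then holds at 1. Cycling
    the weight back to 0 repeatedly is what mitigates KL vanishing.
    n_cycles and ramp_ratio are Fu et al.'s defaults, assumed here.
    """
    period = total_steps / n_cycles
    position = (step % period) / period  # progress within the current cycle, in [0, 1)
    return min(position / ramp_ratio, 1.0)

# Example: the KL weight at the midpoint of the first cycle's ramp.
# With total_steps=10_000 and 4 cycles, step 625 is a quarter into
# the cycle, i.e. halfway up the ramp, giving a weight of 0.5.
print(cyclic_kl_weight(step=625, total_steps=10_000))
```

In training, this weight would multiply the KL term of the VAE objective at each step, so the model periodically re-learns to use the latent variable rather than collapsing to the prior.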