Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Few-Shot Cross-Lingual Stance Detection with Sentiment-Based Pre-training
Authors: Momchil Hardalov, Arnav Arora, Preslav Nakov, Isabelle Augenstein
AAAI 2022, pp. 10729-10737 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this paper, we present the most comprehensive study of cross-lingual stance detection to date: we experiment with 15 diverse datasets in 12 languages from 6 language families, and with 6 low-resource evaluation settings each. For our experiments, we build on pattern-exploiting training (PET), proposing the addition of a novel label encoder to simplify the verbalisation procedure. |
| Researcher Affiliation | Collaboration | Momchil Hardalov1,2, Arnav Arora1,3, Preslav Nakov1,4, Isabelle Augenstein1,3 (1 Checkstep Research; 2 Sofia University St. Kliment Ohridski, Bulgaria; 3 University of Copenhagen, Denmark; 4 Qatar Computing Research Institute, HBKU, Doha, Qatar) |
| Pseudocode | No | The paper describes its method in prose and provides an architectural diagram (Figure 1), but it does not include any formal pseudocode or algorithm blocks. |
| Open Source Code | Yes | The datasets and code are available for research purposes: https://github.com/checkstep/senti-stance |
| Open Datasets | Yes | We use three types of datasets: 15 cross-lingual stance datasets (see Table 1), English stance datasets, and raw Wikipedia data automatically annotated for stance. We use the cross-lingual ones for fine-tuning and evaluation, and the rest for pre-training only. ... The datasets and code are available for research purposes: https://github.com/checkstep/senti-stance |
| Dataset Splits | Yes | The resulting dataset contains around 300K examples, which we split into 80% for training, 10% for development, 10% for testing, thus ensuring that sentences from one article are only included in one of the data splits. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used to run the experiments, such as GPU models, CPU types, or cloud computing instances. |
| Software Dependencies | No | The paper mentions software like XLM-T, XLM-R Base, Stanza, and the Wikipedia Python API, but it does not provide specific version numbers for any of these dependencies. |
| Experiment Setup | No | The paper describes some general aspects of the training objective (e.g., BCE loss, masked language modeling), but it does not specify concrete hyperparameters like learning rate, batch size, number of epochs, or optimizer details. |
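The article-level split quoted in the Dataset Splits row (80/10/10, with all sentences from one article confined to a single split) can be sketched as follows. This is a minimal illustration, not the authors' code: the function name, the `(article_id, sentence)` pair format, and the fixed seed are assumptions for the example.

```python
import random
from collections import defaultdict

def article_level_split(examples, train_frac=0.8, dev_frac=0.1, seed=0):
    """Split (article_id, sentence) pairs roughly 80/10/10 so that
    sentences from one article appear in exactly one split.

    Shuffling happens over article ids, not individual sentences,
    which is what guarantees no article straddles two splits.
    """
    by_article = defaultdict(list)
    for article_id, sentence in examples:
        by_article[article_id].append(sentence)

    articles = sorted(by_article)                 # deterministic base order
    random.Random(seed).shuffle(articles)         # reproducible shuffle

    n = len(articles)
    n_train = int(n * train_frac)
    n_dev = int(n * dev_frac)
    split_ids = {
        "train": articles[:n_train],
        "dev": articles[n_train:n_train + n_dev],
        "test": articles[n_train + n_dev:],
    }
    # Expand article ids back into their sentences for each split.
    return {
        name: [(a, s) for a in ids for s in by_article[a]]
        for name, ids in split_ids.items()
    }
```

Because the fractions apply to articles rather than sentences, the sentence-level proportions only approximate 80/10/10 when articles vary in length; `sklearn.model_selection.GroupShuffleSplit` offers the same group-aware behavior off the shelf.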