Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Few-Shot Cross-Lingual Stance Detection with Sentiment-Based Pre-training
Authors: Momchil Hardalov, Arnav Arora, Preslav Nakov, Isabelle Augenstein
AAAI 2022, pp. 10729-10737 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this paper, we present the most comprehensive study of cross-lingual stance detection to date: we experiment with 15 diverse datasets in 12 languages from 6 language families, and with 6 low-resource evaluation settings each. For our experiments, we build on pattern-exploiting training (PET), proposing the addition of a novel label encoder to simplify the verbalisation procedure. |
| Researcher Affiliation | Collaboration | Momchil Hardalov1,2, Arnav Arora1,3, Preslav Nakov1,4, Isabelle Augenstein1,3 (1 Checkstep Research; 2 Sofia University St. Kliment Ohridski, Bulgaria; 3 University of Copenhagen, Denmark; 4 Qatar Computing Research Institute, HBKU, Doha, Qatar) |
| Pseudocode | No | The paper describes its method in prose and provides an architectural diagram (Figure 1), but it does not include any formal pseudocode or algorithm blocks. |
| Open Source Code | Yes | The datasets and code are available for research purposes: https://github.com/checkstep/senti-stance |
| Open Datasets | Yes | We use three types of datasets: 15 cross-lingual stance datasets (see Table 1), English stance datasets, and raw Wikipedia data automatically annotated for stance. We use the cross-lingual ones for fine-tuning and evaluation, and the rest for pre-training only. ... The datasets and code are available for research purposes: https://github.com/checkstep/senti-stance |
| Dataset Splits | Yes | The resulting dataset contains around 300K examples, which we split into 80% for training, 10% for development, 10% for testing, thus ensuring that sentences from one article are only included in one of the data splits. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used to run the experiments, such as GPU models, CPU types, or cloud computing instances. |
| Software Dependencies | No | The paper mentions software like XLM-T, XLM-R Base, Stanza, and the Wikipedia Python API, but it does not provide specific version numbers for any of these dependencies. |
| Experiment Setup | No | The paper describes some general aspects of the training objective (e.g., BCE loss, masked language modeling), but it does not specify concrete hyperparameters like learning rate, batch size, number of epochs, or optimizer details. |
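The article-level split quoted in the Dataset Splits row (80/10/10, with all sentences from one article confined to a single split) can be sketched as follows. This is a minimal illustration, not the authors' code: the function name, the `(article_id, sentence)` pair format, and the fixed seed are assumptions for the example.

```python
import random
from collections import defaultdict

def article_level_split(examples, train_frac=0.8, dev_frac=0.1, seed=0):
    """Split (article_id, sentence) pairs roughly 80/10/10 so that
    sentences from one article appear in exactly one split.

    Shuffling happens over article ids, not individual sentences,
    which is what guarantees no article straddles two splits.
    """
    by_article = defaultdict(list)
    for article_id, sentence in examples:
        by_article[article_id].append(sentence)

    articles = sorted(by_article)                 # deterministic base order
    random.Random(seed).shuffle(articles)         # reproducible shuffle

    n = len(articles)
    n_train = int(n * train_frac)
    n_dev = int(n * dev_frac)
    split_ids = {
        "train": articles[:n_train],
        "dev": articles[n_train:n_train + n_dev],
        "test": articles[n_train + n_dev:],
    }
    # Expand article ids back into their sentences for each split.
    return {
        name: [(a, s) for a in ids for s in by_article[a]]
        for name, ids in split_ids.items()
    }
```

Because the fractions apply to articles rather than sentences, the sentence-level proportions only approximate 80/10/10 when articles vary in length; `sklearn.model_selection.GroupShuffleSplit` offers the same group-aware behavior off the shelf.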