Exploiting Cross-Lingual Subword Similarities in Low-Resource Document Classification
Authors: Mozhi Zhang, Yoshinari Fujinuma, Jordan Boyd-Graber
AAAI 2020, pp. 9547–9554
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments confirm that character-level knowledge transfer is more data-efficient than word-level transfer between related languages. |
| Researcher Affiliation | Academia | Mozhi Zhang (CS and UMIACS, University of Maryland, College Park, MD, USA; mozhi@cs.umd.edu); Yoshinari Fujinuma (Computer Science, University of Colorado, Boulder, CO, USA; fujinumay@gmail.com); Jordan Boyd-Graber (CS, iSchool, LSC, and UMIACS, University of Maryland, College Park, MD, USA; jbg@umiacs.umd.edu). Now at Google Research, Zürich. |
| Pseudocode | No | The paper describes the model architecture and training process in text and with a diagram (Figure 1), but it does not include a dedicated pseudocode or algorithm block. |
| Open Source Code | No | The paper does not provide any explicit statements about making its source code available or links to a code repository. |
| Open Datasets | Yes | Our first dataset is Reuters multilingual corpus (RCV2), a collection of news stories labeled with four topics (Lewis et al. 2004)... We build a second CLDC dataset with famine-related documents sampled from Tigrinya (TI) and Amharic (AM) LORELEI language packs (Strassel and Tracey 2016). |
| Dataset Splits | No | For each language, we sample 1,500 training documents and 200 test documents with balanced labels. No explicit mention of a separate validation set. |
| Hardware Specification | No | The paper does not specify any particular hardware (e.g., CPU, GPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions "Adam (Kingma and Ba 2015) with default settings" as the optimizer, but does not specify versions for other software dependencies or libraries like Python, PyTorch, or TensorFlow. |
| Experiment Setup | Yes | We use three ReLU layers with 100 hidden units and 0.1 dropout for the CLWE-based DAN models and the DAN classifier of the CACO models. The BI-LSTM embedder uses ten dimensional character embeddings and forty hidden states with no dropout. The outputs of the embedder are forty dimensional word embeddings. We set λd to 1, λe to 0.001, and λp to 1 in the multi-task objective (Equation 11). ... All models are trained with Adam (Kingma and Ba 2015) with default settings. We run the optimizer for a hundred epochs with mini-batches of sixteen documents. |
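
To make the quoted hyperparameters concrete, here is a minimal sketch of the described setup: a character-level BiLSTM word embedder (10-dimensional character embeddings, 40 hidden states, 40-dimensional word outputs) feeding a deep averaging network classifier with three ReLU layers of 100 units and 0.1 dropout, trained with Adam at default settings for 100 epochs on mini-batches of 16 documents. The paper does not name a framework, so PyTorch is assumed purely for illustration; the class names, the character-vocabulary size, the 20-per-direction split of the 40 BiLSTM hidden states, the four-label output, and the placeholder names for the loss terms weighted by λd, λe, and λp are all assumptions, not details from the paper.

```python
import torch
import torch.nn as nn


class CharBiLSTMEmbedder(nn.Module):
    """Builds a word embedding from the word's character sequence."""

    def __init__(self, n_chars, char_dim=10, hidden_dim=40, word_dim=40):
        super().__init__()
        self.char_emb = nn.Embedding(n_chars, char_dim, padding_idx=0)
        # Bidirectional LSTM over characters; the 40 hidden states are split
        # as 20 per direction here (an assumption about the paper's setup).
        self.lstm = nn.LSTM(char_dim, hidden_dim // 2,
                            bidirectional=True, batch_first=True)
        self.proj = nn.Linear(hidden_dim, word_dim)

    def forward(self, char_ids):
        # char_ids: (n_words, max_word_len) padded character indices.
        _, (h_n, _) = self.lstm(self.char_emb(char_ids))
        # Concatenate the final forward and backward hidden states per word.
        h = torch.cat([h_n[0], h_n[1]], dim=-1)
        return self.proj(h)  # (n_words, word_dim)


class DANClassifier(nn.Module):
    """Deep averaging network: mean word embedding -> three ReLU layers -> labels."""

    def __init__(self, word_dim=40, hidden=100, n_labels=4, dropout=0.1):
        super().__init__()
        layers, in_dim = [], word_dim
        for _ in range(3):
            layers += [nn.Linear(in_dim, hidden), nn.ReLU(), nn.Dropout(dropout)]
            in_dim = hidden
        layers.append(nn.Linear(in_dim, n_labels))
        self.net = nn.Sequential(*layers)

    def forward(self, word_vecs):
        # word_vecs: (n_words, word_dim) for a single document.
        return self.net(word_vecs.mean(dim=0))


# Multi-task weights quoted above (Equation 11 in the paper); the loss-term
# names in the comment below are placeholders, not the paper's notation.
lambda_d, lambda_e, lambda_p = 1.0, 1e-3, 1.0
# total_loss = lambda_d * task_loss + lambda_e * embedding_loss + lambda_p * paraphrase_loss

embedder = CharBiLSTMEmbedder(n_chars=128)  # character-vocabulary size is assumed
classifier = DANClassifier(n_labels=4)      # four RCV2 topic labels
optimizer = torch.optim.Adam(               # Adam with default settings, as reported
    list(embedder.parameters()) + list(classifier.parameters()))
# Training reportedly runs for 100 epochs with mini-batches of 16 documents.
```

The sketch covers only the architecture and optimizer configuration reported in the table; the auxiliary dictionary and paraphrase objectives and the data pipeline are left out, since the paper gives no further implementation detail for them.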