Massively Multilingual Sparse Word Representations
Authors: Gábor Berend
ICLR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate that our proposed algorithm performs competitively with strong baselines through a series of rigorous experiments on downstream applications spanning dependency parsing, document classification and natural language inference. |
| Researcher Affiliation | Academia | Gábor Berend, (1) University of Szeged, Institute of Informatics, Szeged, Hungary; (2) MTA-SZTE RGAI, Szeged, Hungary; berendg@inf.u-szeged |
| Pseudocode | Yes | Algorithm 1 Pseudocode of MAMUS |
| Open Source Code | Yes | We make our sparse embeddings for 27 languages and the source code that we used to obtain them publicly available at https://github.com/begab/mamus. |
| Open Datasets | Yes | Our primary source for evaluating our proposed representations is the massively multilingual evaluation framework from (Ammar et al., 2016b), which also includes recommended corpora to be used for training word representations for more than 70 languages. All the embeddings used in our experiments were trained over these recommended resources, which is a combination of the Leipzig Corpora Collection (Goldhahn et al., 2012) and Europarl (Koehn, 2005). |
| Dataset Splits | Yes | During the monolingual experiments, we focused solely on the English development set to set the hyperparameter controlling the sparsity of the representations. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory, or computing environment specifications) used for running the experiments; it only mentions training times. |
| Software Dependencies | Yes | We implemented a simple multilayer perceptron in PyTorch v1.1 (Paszke et al., 2017) with two hidden layers employing ReLU nonlinearity. (A hedged sketch of such an MLP follows the table.) |
| Experiment Setup | Yes | We simply used the default settings of fasttext for training, meaning that the original dense word representations were 100 dimensional. We set the number of semantic atoms in the dictionary matrix D consistently to k = 1200 throughout all our experiments. Based on our monolingual evaluation results from Table 1, we decided to fix the regularization coefficient for MAMUS at λ = 0.1 for all of our upcoming multilingual experiments. The MLP uses categorical cross-entropy as its loss function, which was optimized by Adam (Kingma & Ba, 2014). (A hedged sketch of the sparse-coding setup follows the table.) |
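The Experiment Setup row summarizes the sparse-coding configuration: 100-dimensional fastText vectors, a dictionary D of k = 1200 semantic atoms, and a regularization coefficient λ = 0.1. The snippet below is a minimal sketch of that step only, assuming scikit-learn's `MiniBatchDictionaryLearning` as a stand-in solver and random vectors in place of real embeddings; it is not the authors' MAMUS implementation, which is available at https://github.com/begab/mamus.

```python
# Minimal sketch of the sparse-coding step from the Experiment Setup row.
# Assumptions (not taken from the paper's code): scikit-learn as a stand-in solver,
# random vectors in place of the 100-dimensional fastText embeddings, and that
# lambda maps onto the l1 penalty `alpha`.
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning

rng = np.random.default_rng(0)
X = rng.standard_normal((5000, 100))    # placeholder for 100-dim dense word embeddings

dict_learner = MiniBatchDictionaryLearning(
    n_components=1200,                  # k = 1200 semantic atoms, as reported in the paper
    alpha=0.1,                          # lambda = 0.1 regularization, as reported in the paper
    transform_algorithm="lasso_lars",
    transform_alpha=0.1,
    random_state=0,
)
codes = dict_learner.fit_transform(X)   # sparse coefficients, shape (n_words, 1200)
D = dict_learner.components_            # dictionary matrix D, shape (1200, 100)
print(f"avg. nonzeros per word: {np.count_nonzero(codes) / codes.shape[0]:.1f}")
```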
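The Software Dependencies row quotes the PyTorch MLP used for classification. The following is a hedged sketch of such a model; the hidden-layer sizes, input dimensionality and toy data are assumptions made for illustration, while the two ReLU hidden layers, categorical cross-entropy loss and Adam optimizer come from the quoted text.

```python
# Hedged sketch of the classification MLP quoted in the Software Dependencies row.
import torch
import torch.nn as nn

class MLP(nn.Module):
    def __init__(self, in_dim=1200, hidden=256, n_classes=4):  # sizes are assumptions
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),   # first hidden layer with ReLU
            nn.Linear(hidden, hidden), nn.ReLU(),   # second hidden layer with ReLU
            nn.Linear(hidden, n_classes),
        )

    def forward(self, x):
        return self.net(x)

model = MLP()
criterion = nn.CrossEntropyLoss()          # categorical cross-entropy
optimizer = torch.optim.Adam(model.parameters())

# One illustrative training step on random data (not the paper's datasets)
x = torch.randn(32, 1200)
y = torch.randint(0, 4, (32,))
optimizer.zero_grad()
loss = criterion(model(x), y)
loss.backward()
optimizer.step()
```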