MuLaN: Multilingual Label propagatioN for Word Sense Disambiguation
Authors: Edoardo Barba, Luigi Procopio, Niccolò Campolungo, Tommaso Pasini, Roberto Navigli
IJCAI 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Backed by several experiments, we provide empirical evidence that our automatically created datasets are of a higher quality than those generated by other competitors and lead a supervised model to achieve state-of-the-art performances in all multilingual Word Sense Disambiguation tasks. We make our datasets available for research purposes at https://github.com/SapienzaNLP/mulan. |
| Researcher Affiliation | Academia | Edoardo Barba, Luigi Procopio, Niccolò Campolungo, Tommaso Pasini and Roberto Navigli, Sapienza NLP Group, Department of Computer Science, Sapienza University of Rome {barba,procopio,campolungo,pasini,navigli}@di.uniroma1.it |
| Pseudocode | No | The paper describes the steps of the approach (Vectorization, Candidate Production, Dataset Generation) but does not provide structured pseudocode or algorithm blocks. |
| Open Source Code | No | We make our datasets available for research purposes at https://github.com/SapienzaNLP/mulan. At https://github.com/SapienzaNLP/mulan we release about 800K sentences with more than 1.4M sense-tagged instances in Italian, Spanish, French and German. The provided link is for the datasets, not the source code for the methodology itself. |
| Open Datasets | Yes | As labeled corpus Γ, we use the concatenation of SemCor [Miller et al., 1993] and WNG [Langone et al., 2004] since it is the largest available corpus annotated with senses. As unlabeled corpus Θ, on the other hand, we use Wikipedia... |
| Dataset Splits | Yes | As validation set, due to the lack of any publicly available sets in the languages being considered, we reserved a small random percentage from the training set for this purpose only. A minimal split sketch is given after the table. |
| Hardware Specification | No | The authors wish to thank Babelscape (http://babelscape.com) for providing the computing facilities that made it possible for this work to be carried out. No specific hardware details (e.g., GPU/CPU models) are provided. |
| Software Dependencies | No | Specifically, we use m-BERT to encode the input word pieces into latent vectors... We used FAISS [Johnson et al., 2019] in order to cope with the large number of comparisons to perform. The paper mentions software like m-BERT and FAISS but does not provide specific version numbers for them or any other software dependencies. A hedged encoding-and-indexing sketch follows the table. |
| Experiment Setup | Yes | During training, rather than finetuning all the model parameters, we keep the BERT weights fixed and let the gradient flow through the last layer only. The model is trained for 50 epochs, with early stopping technique set with a patience parameter of 3; we used the Adam optimizer with learning rate fixed at 2 × 10^-5 and a cross-entropy loss criterion. A hedged training sketch follows the table. |
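
The validation split quoted in the Dataset Splits row can be reproduced in a few lines. The sketch below is an assumption-laden illustration: the 5% hold-out fraction and the fixed seed are not stated in the paper.

```python
# Minimal sketch of the validation split described in the Dataset Splits row:
# hold out a small random fraction of the training sentences, since no public
# validation sets exist for the languages considered. The 5% fraction and the
# seed are assumptions, not figures reported by the authors.
import random

def split_train_val(sentences, val_fraction=0.05, seed=13):
    shuffled = sentences[:]                      # copy so the original order is untouched
    random.Random(seed).shuffle(shuffled)
    n_val = max(1, int(len(shuffled) * val_fraction))
    return shuffled[n_val:], shuffled[:n_val]    # (train, validation)
```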
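
The Software Dependencies row mentions m-BERT for encoding word pieces and FAISS for handling the large number of vector comparisons. The sketch below is one plausible way to wire these together; the `bert-base-multilingual-cased` checkpoint, mean pooling, and the inner-product index are assumptions, not the authors' reported configuration.

```python
# Hypothetical sketch of the vectorization and candidate-production steps the
# report quotes: encode sentences with m-BERT and query a FAISS index for
# nearest neighbours. Pooling and index type are assumptions.
import faiss
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModel.from_pretrained("bert-base-multilingual-cased")

def encode(sentences):
    """Return one mean-pooled m-BERT vector per sentence (pooling is an assumption)."""
    batch = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**batch).last_hidden_state      # (batch, tokens, 768)
    mask = batch["attention_mask"].unsqueeze(-1)       # ignore padding positions
    return ((hidden * mask).sum(1) / mask.sum(1)).numpy()

# Index the unlabeled corpus Θ and query it with labeled instances from Γ.
theta_vectors = encode(["Der Fluss tritt über die Ufer.", "La rivière déborde."])
faiss.normalize_L2(theta_vectors)
index = faiss.IndexFlatIP(theta_vectors.shape[1])      # cosine similarity via normalized IP
index.add(theta_vectors)

gamma_vectors = encode(["The river overflowed its banks."])
faiss.normalize_L2(gamma_vectors)
scores, candidates = index.search(gamma_vectors, 2)    # top-k candidates per labeled query
print(candidates, scores)
```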
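
The Experiment Setup row gives enough detail for a minimal training loop: frozen BERT weights with gradients flowing through the last layer only, Adam at 2 × 10^-5, cross-entropy loss, 50 epochs, and early stopping with patience 3. The sketch below reads "last layer" as a trainable classification head on top of the frozen encoder; the dataloaders, label encoding, and sense inventory size are placeholders, not details from the paper.

```python
# Minimal sketch of the training recipe quoted in the Experiment Setup row.
# Only the final classification layer is updated; everything else is frozen.
import torch
import torch.nn as nn
from transformers import AutoModel

class FrozenBertWsd(nn.Module):
    def __init__(self, num_senses: int):
        super().__init__()
        self.bert = AutoModel.from_pretrained("bert-base-multilingual-cased")
        for p in self.bert.parameters():        # keep the BERT weights fixed
            p.requires_grad = False
        self.classifier = nn.Linear(self.bert.config.hidden_size, num_senses)

    def forward(self, input_ids, attention_mask):
        hidden = self.bert(input_ids=input_ids, attention_mask=attention_mask).last_hidden_state
        return self.classifier(hidden)          # one sense distribution per word piece

def train(model, train_loader, val_loader, epochs=50, patience=3):
    optimizer = torch.optim.Adam(model.classifier.parameters(), lr=2e-5)
    loss_fn = nn.CrossEntropyLoss(ignore_index=-100)    # -100 marks untagged tokens
    best_val, bad_epochs = float("inf"), 0
    for _ in range(epochs):
        model.train()
        for input_ids, attention_mask, labels in train_loader:
            optimizer.zero_grad()
            logits = model(input_ids, attention_mask)
            loss = loss_fn(logits.view(-1, logits.size(-1)), labels.view(-1))
            loss.backward()
            optimizer.step()
        model.eval()
        val_loss = 0.0
        with torch.no_grad():
            for input_ids, attention_mask, labels in val_loader:
                out = model(input_ids, attention_mask)
                val_loss += loss_fn(out.view(-1, out.size(-1)), labels.view(-1)).item()
        if val_loss < best_val:
            best_val, bad_epochs = val_loss, 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:          # early stopping with patience 3
                break
```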