MuLaN: Multilingual Label propagatioN for Word Sense Disambiguation

Authors: Edoardo Barba, Luigi Procopio, Niccolò Campolungo, Tommaso Pasini, Roberto Navigli

IJCAI 2020

Reproducibility assessment (variable | result | LLM response):
Research Type: Experimental. "Backed by several experiments, we provide empirical evidence that our automatically created datasets are of a higher quality than those generated by other competitors and lead a supervised model to achieve state-of-the-art performances in all multilingual Word Sense Disambiguation tasks. We make our datasets available for research purposes at https://github.com/SapienzaNLP/mulan."
Researcher Affiliation: Academia. "Edoardo Barba, Luigi Procopio, Niccolò Campolungo, Tommaso Pasini and Roberto Navigli, Sapienza NLP Group, Department of Computer Science, Sapienza University of Rome, {barba,procopio,campolungo,pasini,navigli}@di.uniroma1.it"
Pseudocode: No. The paper describes the steps of the approach (Vectorization, Candidate Production, Dataset Generation) but does not provide structured pseudocode or algorithm blocks; a hedged sketch of these three steps follows the table.
Open Source Code: No. "We make our datasets available for research purposes at https://github.com/SapienzaNLP/mulan." "At https://github.com/SapienzaNLP/mulan we release about 800K sentences with more than 1.4M sense-tagged instances in Italian, Spanish, French and German." The provided link is for the datasets, not the source code for the methodology itself.
Open Datasets: Yes. "As labeled corpus Γ, we use the concatenation of SemCor [Miller et al., 1993] and WNG [Langone et al., 2004] since it is the largest available corpus annotated with senses. As unlabeled corpus Θ, on the other hand, we use Wikipedia..."
Dataset Splits: Yes. "As validation set, due to the lack of any publicly available sets in the languages being considered, we reserved a small random percentage from the training set for this purpose only." A minimal holdout-split sketch follows the table.
Hardware Specification: No. "The authors wish to thank Babelscape (http://babelscape.com) for providing the computing facilities that made it possible for this work to be carried out." No specific hardware details (e.g., GPU/CPU models) are provided.
Software Dependencies: No. "Specifically, we use m-BERT to encode the input word pieces into latent vectors... We used FAISS [Johnson et al., 2019] in order to cope with the large number of comparisons to perform." The paper mentions software such as m-BERT and FAISS but does not provide version numbers for them or for any other dependencies; an illustrative m-BERT + FAISS snippet follows the table.
Experiment Setup: Yes. "During training, rather than finetuning all the model parameters, we keep the BERT weights fixed and let the gradient flow through the last layer only. The model is trained for 50 epochs, with early stopping technique set with a patience parameter of 3; we used the Adam optimizer with learning rate fixed at 2 × 10⁻⁵ and a cross-entropy loss criterion." A training-loop sketch based on these settings follows the table.
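Since the paper outlines Vectorization, Candidate Production and Dataset Generation only in prose, the following is a minimal, hedged sketch of how such a label-propagation pipeline could look. All function names, the cosine-similarity retrieval, the top-k value and the acceptance threshold are illustrative assumptions, not details taken from the paper.

```python
# Hypothetical sketch of the three steps named in the paper; not the authors' code.
import numpy as np

def vectorization(occurrences, encode):
    """Step 1: turn every word occurrence into a contextual vector
    (the paper uses m-BERT for this; `encode` is a stand-in)."""
    return np.stack([encode(text, position) for text, position in occurrences])

def candidate_production(labeled_vecs, unlabeled_vecs, top_k=3):
    """Step 2: for each labeled occurrence, retrieve the most similar
    unlabeled occurrences by cosine similarity (the paper uses FAISS)."""
    a = labeled_vecs / np.linalg.norm(labeled_vecs, axis=1, keepdims=True)
    b = unlabeled_vecs / np.linalg.norm(unlabeled_vecs, axis=1, keepdims=True)
    sims = a @ b.T                                   # cosine similarity matrix
    order = np.argsort(-sims, axis=1)[:, :top_k]     # top-k target indices
    return order, np.take_along_axis(sims, order, axis=1)

def dataset_generation(sense_labels, candidates, scores, threshold=0.8):
    """Step 3: propagate each source sense label to the candidate targets whose
    similarity clears the (assumed) threshold, yielding silver annotations."""
    silver = []
    for sense, row, row_scores in zip(sense_labels, candidates, scores):
        silver.extend((int(t), sense) for t, s in zip(row, row_scores) if s >= threshold)
    return silver

# Toy demo with random vectors standing in for real m-BERT encodings.
rng = np.random.default_rng(0)
labeled, unlabeled = rng.normal(size=(4, 8)), rng.normal(size=(10, 8))
cands, sims = candidate_production(labeled, unlabeled, top_k=2)
print(dataset_generation(["bank#1", "bank#2", "cell#1", "cell#2"], cands, sims, threshold=0.2))
```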
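For the validation strategy quoted in the Dataset Splits row (a small random percentage held out from the training set), a minimal sketch could look as follows; the 5% fraction, the seed and the helper name are assumptions, since the paper does not state the percentage used.

```python
import random

def random_holdout(examples, valid_fraction=0.05, seed=13):
    """Reserve a small random percentage of the training examples as the
    validation set; the remaining examples stay in the training set."""
    examples = list(examples)
    random.Random(seed).shuffle(examples)
    n_valid = max(1, int(len(examples) * valid_fraction))
    return examples[n_valid:], examples[:n_valid]   # (train, validation)

train_set, valid_set = random_holdout(range(1000))
print(len(train_set), len(valid_set))               # 950 50
```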
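Because the paper names m-BERT and FAISS without pinning versions, the snippet below illustrates one plausible way to index m-BERT vectors with FAISS. The checkpoint name, the sentence-level mean pooling (the paper encodes individual word pieces), and the inner-product index are assumptions, not details confirmed by the paper.

```python
# Illustrative m-BERT + FAISS retrieval; model name and pooling are assumptions.
import faiss
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModel.from_pretrained("bert-base-multilingual-cased")

def embed(sentences):
    """Mean-pool the last hidden layer into one vector per sentence."""
    batch = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**batch).last_hidden_state          # (batch, tokens, 768)
    mask = batch["attention_mask"].unsqueeze(-1)
    pooled = (hidden * mask).sum(1) / mask.sum(1)          # masked mean pooling
    return pooled.numpy().astype("float32")

labeled = embed(["He sat on the bank of the river."])
unlabeled = embed(["Sie saß am Ufer des Flusses.", "The bank raised interest rates."])

index = faiss.IndexFlatIP(labeled.shape[1])                # inner-product index
faiss.normalize_L2(labeled)                                # cosine via normalized IP
faiss.normalize_L2(unlabeled)
index.add(unlabeled)
scores, neighbors = index.search(labeled, 2)               # nearest unlabeled sentences
print(scores, neighbors)
```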
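Finally, a hedged PyTorch sketch of the quoted training regime: BERT weights frozen, only a final classification layer trained, Adam at 2 × 10⁻⁵, cross-entropy loss, 50 epochs with early-stopping patience 3. Interpreting "the last layer" as a token-level classification head, along with the batch format, label padding convention and data loaders, is an assumption made for illustration.

```python
# Sketch of the stated training setup; architecture details are assumptions.
import torch
from torch import nn
from transformers import AutoModel

class FrozenBertClassifier(nn.Module):
    def __init__(self, num_senses, model_name="bert-base-multilingual-cased"):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(model_name)
        for p in self.encoder.parameters():
            p.requires_grad = False                      # keep BERT weights fixed
        self.classifier = nn.Linear(self.encoder.config.hidden_size, num_senses)

    def forward(self, input_ids, attention_mask):
        hidden = self.encoder(input_ids=input_ids,
                              attention_mask=attention_mask).last_hidden_state
        return self.classifier(hidden)                   # per-token sense logits

def evaluate(model, loader, criterion):
    model.eval()
    total, count = 0.0, 0
    with torch.no_grad():
        for batch in loader:
            logits = model(batch["input_ids"], batch["attention_mask"])
            total += criterion(logits.reshape(-1, logits.size(-1)),
                               batch["labels"].reshape(-1)).item()
            count += 1
    return total / max(count, 1)

def train(model, train_loader, valid_loader, epochs=50, patience=3):
    optimizer = torch.optim.Adam(model.classifier.parameters(), lr=2e-5)
    criterion = nn.CrossEntropyLoss(ignore_index=-100)   # -100 marks untagged tokens (assumed)
    best_valid, stale = float("inf"), 0
    for _ in range(epochs):
        model.train()
        for batch in train_loader:
            optimizer.zero_grad()
            logits = model(batch["input_ids"], batch["attention_mask"])
            loss = criterion(logits.reshape(-1, logits.size(-1)),
                             batch["labels"].reshape(-1))
            loss.backward()                              # gradient reaches the classifier only
            optimizer.step()
        valid_loss = evaluate(model, valid_loader, criterion)
        if valid_loss < best_valid:
            best_valid, stale = valid_loss, 0
        else:
            stale += 1
            if stale >= patience:                        # early stopping, patience 3
                break
```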