Automated Generation of Multilingual Clusters for the Evaluation of Distributed Representations
Authors: Philip Blair, Yuval Merhav, Joel Barry
ICLR 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We used our methodology to create a gold-standard dataset, which we call WikiSem500, and evaluated multiple state-of-the-art embeddings. The results show a correlation between performance on this dataset and performance on sentiment analysis. |
| Researcher Affiliation | Industry | Philip Blair, Yuval Merhav & Joel Barry Basis Technology One Alewife Center Cambridge, MA 02140 USA {pblair,yuval,joelb}@basistech.com |
| Pseudocode | No | A full formalization of our approach is described in Appendix A. |
| Open Source Code | No | The dataset is available for download at https://github.com/belph/wiki-sem-500 |
| Open Datasets | Yes | We release the first version of our dataset, which we call WikiSem500, to the research community. It contains around 500 per-language cluster groups for English, Spanish, German, Chinese, and Japanese (a total of 13,314 test cases). [...] The dataset is available for download at https://github.com/belph/wiki-sem-500 |
| Dataset Splits | No | No explicit train/validation/test splits are provided for the WikiSem500 dataset itself or for the evaluation process described in the paper. The paper evaluates pre-trained embeddings on their dataset. |
| Hardware Specification | No | The paper evaluates various pre-trained word embeddings but does not specify the hardware used to run the evaluation experiments on the WikiSem500 dataset. |
| Software Dependencies | No | The paper mentions different embedding models (GloVe, CBOW, Skip-Gram) and corpora used, but does not provide specific software dependencies with version numbers for their experimental setup. |
| Experiment Setup | No | The paper describes how multi-word entities and out-of-vocabulary words were handled, and that cosine similarity was used as the similarity measure. However, it does not provide specific experimental setup details such as hyperparameters, optimization settings, or training schedules. |
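The table notes that cosine similarity was the similarity measure used when evaluating embeddings against the dataset's cluster groups. As a rough illustration (not the authors' code), the sketch below shows one naive way to score a cluster-outlier test case with cosine similarity: the candidate whose average similarity to the other cluster members is lowest is flagged as the outlier. The function names and the averaged-similarity scoring rule are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two dense embedding vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def find_outlier(vectors):
    """Return the index of the vector least similar, on average,
    to the remaining vectors in the cluster (a naive scoring rule,
    assumed for illustration)."""
    n = len(vectors)
    avg_sims = []
    for i in range(n):
        sims = [cosine(vectors[i], vectors[j]) for j in range(n) if j != i]
        avg_sims.append(sum(sims) / len(sims))
    return int(np.argmin(avg_sims))

# Toy example: three similar vectors plus one dissimilar one.
cluster = [
    np.array([1.0, 0.1]),
    np.array([0.9, 0.2]),
    np.array([1.1, 0.0]),
    np.array([0.0, 1.0]),  # the planted outlier
]
print(find_outlier(cluster))  # → 3
```

In a real evaluation the vectors would come from a pre-trained embedding model (e.g. GloVe or word2vec lookups for the cluster's entity names), with multi-word and out-of-vocabulary entities handled as the paper describes.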