Automated Generation of Multilingual Clusters for the Evaluation of Distributed Representations
Authors: Philip Blair, Yuval Merhav, Joel Barry
ICLR 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We used our methodology to create a gold-standard dataset, which we call WikiSem500, and evaluated multiple state-of-the-art embeddings. The results show a correlation between performance on this dataset and performance on sentiment analysis. |
| Researcher Affiliation | Industry | Philip Blair, Yuval Merhav & Joel Barry Basis Technology One Alewife Center Cambridge, MA 02140 USA {pblair,yuval,joelb}@basistech.com |
| Pseudocode | No | A full formalization of our approach is described in Appendix A. |
| Open Source Code | No | The dataset is available for download at https://github.com/belph/wiki-sem-500 |
| Open Datasets | Yes | We release the first version of our dataset, which we call WikiSem500, to the research community. It contains around 500 per-language cluster groups for English, Spanish, German, Chinese, and Japanese (a total of 13,314 test cases). [...] The dataset is available for download at https://github.com/belph/wiki-sem-500 |
| Dataset Splits | No | No explicit train/validation/test splits are provided for the WikiSem500 dataset itself or for the evaluation process described in the paper. The paper evaluates pre-trained embeddings on their dataset. |
| Hardware Specification | No | The paper evaluates various pre-trained word embeddings but does not specify the hardware used to run the evaluation experiments on the WikiSem500 dataset. |
| Software Dependencies | No | The paper mentions different embedding models (GloVe, CBOW, Skip-Gram) and corpora used, but does not provide specific software dependencies with version numbers for their experimental setup. |
| Experiment Setup | No | The paper describes how multi-word entities and out-of-vocabulary words were handled, and that cosine similarity was used as the similarity measure. However, it does not provide specific experimental setup details such as hyperparameters, optimization settings, or training schedules. |
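The table notes that cosine similarity was the similarity measure used when evaluating embeddings against the dataset's cluster groups. As a rough illustration (not the authors' code), the sketch below shows one naive way to score a cluster-outlier test case with cosine similarity: the candidate whose average similarity to the other cluster members is lowest is flagged as the outlier. The function names and the averaged-similarity scoring rule are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two dense embedding vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def find_outlier(vectors):
    """Return the index of the vector least similar, on average,
    to the remaining vectors in the cluster (a naive scoring rule,
    assumed for illustration)."""
    n = len(vectors)
    avg_sims = []
    for i in range(n):
        sims = [cosine(vectors[i], vectors[j]) for j in range(n) if j != i]
        avg_sims.append(sum(sims) / len(sims))
    return int(np.argmin(avg_sims))

# Toy example: three similar vectors plus one dissimilar one.
cluster = [
    np.array([1.0, 0.1]),
    np.array([0.9, 0.2]),
    np.array([1.1, 0.0]),
    np.array([0.0, 1.0]),  # the planted outlier
]
print(find_outlier(cluster))  # → 3
```

In a real evaluation the vectors would come from a pre-trained embedding model (e.g. GloVe or word2vec lookups for the cluster's entity names), with multi-word and out-of-vocabulary entities handled as the paper describes.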