Leveraging Monolingual Data for Crosslingual Compositional Word Representations
Authors: Hubert Soyer, Pontus Stenetorp, and Akiko Aizawa
ICLR 2015
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our method on a well-established crosslingual document classification task and achieve results that are either comparable, or greatly improve upon previous state-of-the-art methods. Concretely, our method reaches a level of 92.7% and 84.4% accuracy for the English to German and German to English sub-tasks respectively. |
| Researcher Affiliation | Academia | Hubert Soyer, National Institute of Informatics, Tokyo, Japan (soyer@nii.ac.jp); Pontus Stenetorp, University of Tokyo, Tokyo, Japan (pontus@stenetorp.se); Akiko Aizawa, National Institute of Informatics, Tokyo, Japan (aizawa@nii.ac.jp) |
| Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | Yes | Our implementation is available at https://github.com/ogh/binclusion |
| Open Datasets | Yes | Like Klementiev et al. (2012) we choose Euro Parl v7 (Koehn, 2005) as our bilingual corpus and leverage the English and German parts of the RCV1 and RCV2 corpora as monolingual resources. |
| Dataset Splits | No | The paper mentions tuning hyperparameters on "held out documents" but does not provide the size or percentage of a validation split. The test set size is stated, but no validation set size is given. |
| Hardware Specification | No | The paper mentions training on a "single-core desktop computer" but does not provide specific hardware details such as CPU model, GPU model, or memory specifications. |
| Software Dependencies | No | The paper mentions software such as "NLTK" and the "cdec" decoder, and the programming language "Julia", but does not specify version numbers for any of these components. |
| Experiment Setup | Yes | We tuned all hyperparameters of our model and explored learning rates around 0.2, mini-batch sizes around 40,000, hinge loss margins around 40 (since our vector dimensionality is 40) and λ (regularization) around 1.0. |
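The Experiment Setup row reports concrete hyperparameter values: a learning rate near 0.2, mini-batches near 40,000, a hinge loss margin near 40 (matching the 40-dimensional vectors), and λ near 1.0. As a minimal sketch of how such a margin-based objective with L2 regularization might look, assuming a generic squared-Euclidean ranking hinge rather than the paper's exact objective (the function names here are hypothetical):

```python
import numpy as np

# Hyperparameter values reported in the paper's Experiment Setup row.
LEARNING_RATE = 0.2
BATCH_SIZE = 40_000
DIM = 40
MARGIN = 40.0   # the paper sets the margin equal to the dimensionality
LAMBDA = 1.0    # L2 regularization strength

def hinge_ranking_loss(src, tgt, noise, margin=MARGIN):
    """Generic margin-based hinge loss (a sketch, not the paper's exact
    formulation): an aligned pair (src, tgt) should be closer than the
    mismatched pair (src, noise) by at least `margin`."""
    d_pos = np.sum((src - tgt) ** 2)    # distance to the aligned sample
    d_neg = np.sum((src - noise) ** 2)  # distance to the noise sample
    return max(0.0, margin + d_pos - d_neg)

def regularized_objective(params, loss, lam=LAMBDA):
    """Add an L2 penalty over all parameter arrays to the data loss."""
    return loss + lam * sum(np.sum(p ** 2) for p in params)
```

With this shape of objective, a perfectly separated triple (noise far, aligned pair identical) incurs zero loss, while a noise vector as close as the aligned one incurs the full margin.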