Multilingual Distributed Representations without Word Alignment

Authors: Karl Moritz Hermann, Phil Blunsom

ICLR 2014

Each entry below lists a reproducibility variable, the assessed result, and the supporting LLM response; evidence quoted verbatim from the paper appears in quotation marks.
Research Type: Experimental. "We present results from two experiments. The BICVM model was trained on 500k sentence pairs of the English-German parallel section of the Europarl corpus. We evaluate our model using the cross-lingual document classification (CLDC) task of Klementiev et al. [16]."
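The quoted setup corresponds to the paper's compositional objective: each sentence is composed additively from its word vectors, and aligned sentence pairs are pulled closer in the shared space than randomly sampled noise pairs, by a fixed margin. Below is a minimal NumPy sketch of that noise-contrastive hinge loss; the function names and inputs are illustrative, not taken from the released code.

```python
import numpy as np

def compose(word_vectors):
    """Additive (CVM) composition: a sentence vector is the sum of its word vectors."""
    return np.sum(word_vectors, axis=0)

def energy(a_vecs, b_vecs):
    """Squared Euclidean distance between two composed sentence representations."""
    diff = compose(a_vecs) - compose(b_vecs)
    return float(diff @ diff)

def noise_contrastive_hinge(a_vecs, b_vecs, noise_sentences, margin=50.0):
    """Large-margin objective: the aligned pair (a, b) should sit at least
    `margin` lower in energy than each noise pair (a, n)."""
    e_pos = energy(a_vecs, b_vecs)
    return sum(max(0.0, margin + e_pos - energy(a_vecs, n))
               for n in noise_sentences)
```

The margin default of 50.0 mirrors the margin size quoted in the experiment-setup entry further down.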
Researcher Affiliation: Academia. Karl Moritz Hermann and Phil Blunsom, Department of Computer Science, University of Oxford, Oxford, OX1 3QD, UK ({karl.moritz.hermann,phil.blunsom}@cs.ox.ac.uk).
Pseudocode: No. The paper describes the model through equations and textual explanations but includes no structured pseudocode or algorithm blocks.
Open Source Code: Yes. "Results for other dimensionalities and the source code for our model are available at http://www.karlmoritz.com."
Open Datasets: Yes. "We use the Europarl corpus (v7) for training the bilingual model. The corpus was pre-processed using the set of tools provided by cdec [9] for tokenizing and lowercasing the data." The corpus is publicly available at http://www.statmt.org/europarl/.
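Since this entry only names the pre-processing steps (tokenizing and lowercasing), here is a rough stand-in sketch; cdec's actual tokenizer applies language-aware rules, so the whitespace splitting below is an assumption made purely for illustration.

```python
def preprocess(line):
    # Stand-in for the cdec pipeline: lowercase, then tokenize.
    # cdec handles punctuation and language-specific rules; plain
    # whitespace splitting is only an approximation.
    return line.lower().split()

# Hypothetical parallel sentence pair in the style of Europarl.
en_tokens = preprocess("The sitting is resumed .")
de_tokens = preprocess("Die Sitzung wird wieder aufgenommen .")
```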
Dataset Splits: Yes. "We ran the CLDC experiments both by training on English and testing on German documents and vice versa. Using the data splits provided by [16], we used varying training data sizes from 100 to 10,000 documents for training the multiclass classifier."
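In the CLDC setup, documents from these splits are mapped into the shared embedding space and a multiclass classifier is trained on the resulting vectors; the paper follows [16] in using an averaged perceptron. A hedged sketch follows, assuming plain averaging for document composition and an illustrative epoch count:

```python
import numpy as np

def document_vector(tokens, embeddings, dim=40):
    """Average the embeddings of a document's known words (d=40 per the setup entry below)."""
    vecs = [embeddings[w] for w in tokens if w in embeddings]
    return np.mean(vecs, axis=0) if vecs else np.zeros(dim)

def train_averaged_perceptron(X, y, n_classes, epochs=10):
    """Multiclass averaged perceptron: one weight vector per class,
    with the returned weights averaged over all updates."""
    W = np.zeros((n_classes, X.shape[1]))   # current weights
    W_sum = np.zeros_like(W)                # running sum for averaging
    for _ in range(epochs):
        for x, label in zip(X, y):
            pred = int(np.argmax(W @ x))
            if pred != label:               # standard perceptron update
                W[label] += x
                W[pred] -= x
            W_sum += W
    return W_sum / (epochs * len(X))
```

Averaging the weights over all updates, rather than keeping the final weights, is what makes the perceptron "averaged" and typically stabilizes it on small training sets such as the 100-document condition.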
Hardware Specification: No. The paper does not specify any hardware details, such as CPU or GPU models or the memory used, for running the experiments.
Software Dependencies: No. The paper mentions using cdec for pre-processing and refers to an averaged perceptron classifier implementation from prior work, but it does not name any other software dependencies or their version numbers.
Experiment Setup: Yes. "L2 regularization (1), step-size (0.1), number of noise elements (50), margin size (50), embedding dimensionality (d=40). We use the adaptive gradient method, AdaGrad [8], for updating the weights of our models, and terminate training after 50 iterations."
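The quoted step-size of 0.1 feeds directly into AdaGrad's per-parameter update, which scales each gradient by the inverse square root of the accumulated squared gradients. A minimal sketch; the epsilon term is a common numerical-stability assumption, not a value stated in the paper:

```python
import numpy as np

def adagrad_update(w, grad, hist, step_size=0.1, eps=1e-6):
    """One AdaGrad step: per-parameter learning rates shrink for
    parameters whose gradients have historically been large."""
    hist += grad ** 2                              # accumulate squared gradients
    w -= step_size * grad / (np.sqrt(hist) + eps)  # scaled gradient step
    return w, hist
```

Per the quoted setup, training would apply this update across the corpus and terminate after 50 iterations.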