Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
A Family of Latent Variable Convex Relaxations for IBM Model 2
Authors: Andrei Simion, Michael Collins, Cliff Stein
AAAI 2015
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section we describe experiments using the I2CR-3 and I2CR-4 optimization problems combined with the EM algorithm for these problems. For our experiments we only used β = 1/2, but note that β can be cross-validated for optimal performance. For our alignment experiments, we used a subset of the Canadian Hansards bilingual corpus with 247,878 English-French sentence pairs as training data, 37 sentences of development data, and 447 sentences of test data (Mihalcea and Pedersen 2003). As a second corpus, we considered a training set of 48,706 Romanian-English sentence pairs, a development set of 17 sentence pairs, and a test set of 248 sentence pairs (Mihalcea and Pedersen 2003). For our SMT experiments, we chose a subset of the English-German Europarl bilingual corpus, using 274,670 sentences for training, 1,806 for development, and 1,840 for test. |
| Researcher Affiliation | Academia | Andrei Simion Columbia University IEOR Department New York, NY, 10027 EMAIL Michael Collins Columbia University Computer Science New York, NY, 10027 EMAIL Clifford Stein Columbia University IEOR Department New York, NY, 10027 EMAIL |
| Pseudocode | Yes | Figure 3: Pseudocode for T iterations of the EM Algorithm for the I2CR-4 problem. |
| Open Source Code | No | The paper does not provide any explicit statement about releasing code or a link to a code repository for the described methodology. |
| Open Datasets | Yes | For our alignment experiments, we used a subset of the Canadian Hansards bilingual corpus with 247,878 English-French sentence pairs as training data, 37 sentences of development data, and 447 sentences of test data (Mihalcea and Pedersen 2003). As a second corpus, we considered a training set of 48,706 Romanian-English sentence pairs, a development set of 17 sentence pairs, and a test set of 248 sentence pairs (Mihalcea and Pedersen 2003). For our SMT experiments, we chose a subset of the English-German Europarl bilingual corpus, using 274,670 sentences for training, 1,806 for development, and 1,840 for test. |
| Dataset Splits | Yes | For our alignment experiments, we used a subset of the Canadian Hansards bilingual corpus with 247,878 English-French sentence pairs as training data, 37 sentences of development data, and 447 sentences of test data (Mihalcea and Pedersen 2003). As a second corpus, we considered a training set of 48,706 Romanian-English sentence pairs, a development set of 17 sentence pairs, and a test set of 248 sentence pairs (Mihalcea and Pedersen 2003). For our SMT experiments, we chose a subset of the English-German Europarl bilingual corpus, using 274,670 sentences for training, 1,806 for development, and 1,840 for test. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running experiments. |
| Software Dependencies | No | The paper mentions the 'cdec system (Dyer et al. 2010)' but does not provide specific version numbers for any software dependencies required to replicate the experiment. |
| Experiment Setup | Yes | For our experiments we only used β = 1/2, but note that β can be cross-validated for optimal performance. In training IBM Model 2 we first train IBM Model 1 for 5 iterations to initialize the t parameters, then train IBM Model 2 for a further 15 iterations (Och and Ney 2003). For the I2CR models, we use 15 iterations over the training data and seed all parameters to uniform probabilities. |