BattRAE: Bidimensional Attention-Based Recursive Autoencoders for Learning Bilingual Phrase Embeddings
Authors: Biao Zhang, Deyi Xiong, Jinsong Su
AAAI 2017 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To evaluate the effectiveness of BattRAE, we incorporate this semantic similarity as an additional feature into a state-of-the-art SMT system. Extensive experiments on NIST Chinese-English test sets show that our model achieves a substantial improvement of up to 1.63 BLEU points on average over the baseline. |
| Researcher Affiliation | Academia | Xiamen University, Xiamen, China 361005; Soochow University, Suzhou, China 215006 |
| Pseudocode | No | The paper describes algorithms and procedures in narrative text and mathematical equations but does not present any formal pseudocode or algorithm blocks. |
| Open Source Code | Yes | Source code is available at https://github.com/DeepLearnXMU/BattRAE. |
| Open Datasets | Yes | Our parallel corpus consists of 1.25M sentence pairs extracted from LDC corpora, with 27.9M Chinese words and 34.5M English words respectively. We trained a 5-gram language model on the Xinhua portion of the GIGAWORD corpus (247.6M English words) using SRILM Toolkit with modified Kneser-Ney Smoothing. (See the language-model sketch after the table.) |
| Dataset Splits | Yes | We used the NIST MT05 data set as the development set, and the NIST MT06/MT08 datasets as the test sets. [...] From these pairs, we further extracted 34K bilingual phrases as our development data to optimize all hyper-parameters using random search (Bergstra and Bengio 2012). (See the random-search sketch after the table.) |
| Hardware Specification | No | The paper does not specify any hardware details like CPU models, GPU types, or memory used for the experiments. |
| Software Dependencies | No | The paper mentions using 'SRILM Toolkit', 'Word2Vec', and 'libLBFGS' but does not provide specific version numbers for any of these software dependencies. |
| Experiment Setup | Yes | Finally, we set ds=dt=da=dsem=50, α=0.125 (such that β=0.875), λL=1e-5, λrec=λatt=1e-4 and λsem=1e-3 according to experiments on the development data. Additionally, we set the maximum number of iterations in the L-BFGS algorithm to 100. (See the configuration sketch after the table.) |
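The Open Datasets row quotes the paper's language-model setup: a 5-gram model with modified Kneser-Ney smoothing trained with SRILM on the Xinhua portion of GIGAWORD. As a minimal sketch of the same idea, the snippet below fits an interpolated Kneser-Ney n-gram model with NLTK on a toy corpus. NLTK's `KneserNeyInterpolated` and the two toy sentences are stand-ins chosen for illustration, not the paper's SRILM pipeline.

```python
from nltk.lm import KneserNeyInterpolated
from nltk.lm.preprocessing import padded_everygram_pipeline

# Toy tokenized corpus standing in for the 247.6M-word Xinhua
# portion of GIGAWORD used in the paper.
corpus = [
    ["we", "trained", "a", "language", "model"],
    ["the", "model", "uses", "kneser", "ney", "smoothing"],
]

order = 5  # 5-gram, as reported in the paper
train_ngrams, vocab = padded_everygram_pipeline(order, corpus)

# Interpolated Kneser-Ney here approximates SRILM's modified
# Kneser-Ney; the two discounting schemes are related but not identical.
lm = KneserNeyInterpolated(order)
lm.fit(train_ngrams, vocab)

print(len(lm.vocab))  # vocabulary size, including padding symbols
```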
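The Dataset Splits row notes that all hyper-parameters were tuned by random search (Bergstra and Bengio 2012) on 34K development phrases. Below is a plain random-search loop in that spirit; the sampling ranges and the `evaluate` callback are hypothetical, since the paper does not publish its search distributions.

```python
import random

# Hypothetical sampling distributions for the hyper-parameters the
# paper tunes; the true ranges are not reported.
SEARCH_SPACE = {
    "alpha":      lambda: random.uniform(0.0, 1.0),      # loss interpolation weight
    "lambda_L":   lambda: 10 ** random.uniform(-6, -3),  # word-embedding regularizer
    "lambda_rec": lambda: 10 ** random.uniform(-5, -2),  # reconstruction regularizer
    "lambda_att": lambda: 10 ** random.uniform(-5, -2),  # attention regularizer
    "lambda_sem": lambda: 10 ** random.uniform(-4, -1),  # semantic regularizer
}

def random_search(evaluate, trials=50, seed=0):
    """Sample configurations independently and keep the best one.

    `evaluate` is a placeholder that would train BattRAE with the
    given configuration and return its development-set score.
    """
    random.seed(seed)
    best_cfg, best_score = None, float("-inf")
    for _ in range(trials):
        cfg = {name: draw() for name, draw in SEARCH_SPACE.items()}
        score = evaluate(cfg)
        if score > best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score
```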
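The Experiment Setup row gives the values the search converged on and caps L-BFGS at 100 iterations. The sketch below collects those reported settings in one place and shows how such an iteration cap is passed to an off-the-shelf L-BFGS optimizer; SciPy's L-BFGS-B routine and the quadratic toy objective are stand-ins for the paper's libLBFGS-based training, not its actual code.

```python
import numpy as np
from scipy.optimize import minimize

# Hyper-parameter values reported in the paper (beta = 1 - alpha).
HYPERPARAMS = {
    "d_s": 50, "d_t": 50, "d_a": 50, "d_sem": 50,
    "alpha": 0.125,        # hence beta = 0.875
    "lambda_L": 1e-5,
    "lambda_rec": 1e-4,
    "lambda_att": 1e-4,
    "lambda_sem": 1e-3,
}

def objective(theta):
    """Toy stand-in for the BattRAE joint objective; returns (loss, grad)."""
    loss = 0.5 * float(np.dot(theta, theta))
    grad = theta
    return loss, grad

theta0 = np.ones(10)
result = minimize(objective, theta0, jac=True, method="L-BFGS-B",
                  options={"maxiter": 100})  # cap at 100 iterations, as in the paper
print(result.nit, result.fun)
```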