BattRAE: Bidimensional Attention-Based Recursive Autoencoders for Learning Bilingual Phrase Embeddings

Authors: Biao Zhang, Deyi Xiong, Jinsong Su

AAAI 2017 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | To evaluate the effectiveness of BattRAE, we incorporate this semantic similarity as an additional feature into a state-of-the-art SMT system. Extensive experiments on NIST Chinese-English test sets show that our model achieves a substantial improvement of up to 1.63 BLEU points on average over the baseline. (A similarity-feature sketch follows the table.)
Researcher Affiliation | Academia | Xiamen University, Xiamen, China 361005; Soochow University, Suzhou, China 215006
Pseudocode | No | The paper describes algorithms and procedures in narrative text and mathematical equations but does not present any formal pseudocode or algorithm blocks.
Open Source Code | Yes | Source code is available at https://github.com/DeepLearnXMU/BattRAE.
Open Datasets | Yes | Our parallel corpus consists of 1.25M sentence pairs extracted from LDC corpora, with 27.9M Chinese words and 34.5M English words respectively. We trained a 5-gram language model on the Xinhua portion of the GIGAWORD corpus (247.6M English words) using the SRILM Toolkit with modified Kneser-Ney smoothing.
Dataset Splits | Yes | We used the NIST MT05 data set as the development set, and the NIST MT06/MT08 datasets as the test sets. [...] From these pairs, we further extracted 34K bilingual phrases as our development data to optimize all hyper-parameters using random search (Bergstra and Bengio 2012). (A random-search sketch follows the table.)
Hardware Specification | No | The paper does not specify any hardware details such as CPU models, GPU types, or memory used for the experiments.
Software Dependencies | No | The paper mentions using the SRILM Toolkit, Word2Vec, and libLBFGS but does not provide specific version numbers for any of these software dependencies.
Experiment Setup | Yes | Finally, we set d_s = d_t = d_a = d_sem = 50, α = 0.125 (such that β = 0.875), λ_L = 1e-5, λ_rec = λ_att = 1e-4 and λ_sem = 1e-3 according to experiments on the development data. Additionally, we set the maximum number of iterations in the L-BFGS algorithm to 100. (An L-BFGS sketch follows the table.)
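
The Research Type row quotes the paper's use of BattRAE's phrase-level semantic similarity as an extra feature in an SMT log-linear model. As a minimal, hypothetical illustration of such a feature (a plain cosine score, not the paper's bidimensional attention-based similarity model), consider:

```python
import numpy as np

def cosine_similarity(src_vec: np.ndarray, tgt_vec: np.ndarray) -> float:
    """Cosine similarity between a source and a target phrase embedding."""
    denom = np.linalg.norm(src_vec) * np.linalg.norm(tgt_vec)
    return float(src_vec @ tgt_vec / denom) if denom else 0.0

def phrase_pair_feature(src_emb: np.ndarray, tgt_emb: np.ndarray) -> float:
    """Hypothetical semantic-similarity feature for one bilingual phrase pair,
    appended to the phrase table alongside the standard SMT features."""
    return cosine_similarity(src_emb, tgt_emb)

# Toy usage with random 50-dimensional embeddings (d_sem = 50 in the paper).
rng = np.random.default_rng(0)
src, tgt = rng.standard_normal(50), rng.standard_normal(50)
print(phrase_pair_feature(src, tgt))
```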
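
The Dataset Splits row notes that all hyper-parameters were optimized by random search (Bergstra and Bengio 2012) on 34K development phrase pairs. A minimal sketch of that procedure, assuming hypothetical search ranges and an `evaluate_on_dev` callback (the paper reports only the final chosen values):

```python
import random

def random_search(evaluate_on_dev, n_trials: int = 50, seed: int = 0):
    """Sample hyper-parameter configurations at random and keep the best.

    `evaluate_on_dev` is a hypothetical callback that trains the model with a
    configuration and returns a development-set score (higher is better).
    """
    rng = random.Random(seed)
    best_config, best_score = None, float("-inf")
    for _ in range(n_trials):
        config = {
            # Assumed ranges; only the winning values appear in the paper.
            "dim": rng.choice([25, 50, 100]),         # d_s = d_t = d_a = d_sem
            "alpha": rng.uniform(0.0, 1.0),           # beta = 1 - alpha
            "lambda_L": 10 ** rng.uniform(-6, -3),
            "lambda_rec": 10 ** rng.uniform(-5, -2),  # paper ties this to lambda_att
            "lambda_sem": 10 ** rng.uniform(-4, -2),
        }
        score = evaluate_on_dev(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score
```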
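
The Experiment Setup row caps L-BFGS at 100 iterations. The paper cites libLBFGS, but the same cap can be illustrated with SciPy's L-BFGS-B on a placeholder L2-regularized objective (the objective below is a toy stand-in, not the BattRAE loss):

```python
import numpy as np
from scipy.optimize import minimize

LAMBDA_L = 1e-5  # embedding regularizer lambda_L from the reported setup

def objective(theta: np.ndarray):
    """Toy quadratic loss plus an L2 penalty; returns (value, gradient)
    so that L-BFGS can use analytic gradients (jac=True below)."""
    value = 0.5 * np.sum((theta - 1.0) ** 2) + 0.5 * LAMBDA_L * np.sum(theta ** 2)
    grad = (theta - 1.0) + LAMBDA_L * theta
    return value, grad

theta0 = np.zeros(50)  # dimensionality matches d_sem = 50
result = minimize(objective, theta0, jac=True, method="L-BFGS-B",
                  options={"maxiter": 100})  # iteration cap from the paper
print(result.nit, result.fun)
```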