Agreement on Target-Bidirectional LSTMs for Sequence-to-Sequence Learning

Authors: Lemao Liu, Andrew Finch, Masao Utiyama, Eiichiro Sumita

AAAI 2016

Each entry below lists a reproducibility variable, the assessed result, and the supporting LLM response.

Research Type: Experimental
LLM Response: Extensive experiments were performed on two standard sequence-to-sequence transduction tasks: machine transliteration and grapheme-to-phoneme transformation.

Researcher Affiliation: Academia
LLM Response: Lemao Liu, Andrew Finch, Masao Utiyama, Eiichiro Sumita; National Institute of Information and Communications Technology (NICT), 3-5 Hikari-dai, Seika-cho, Soraku-gun, Kyoto, Japan; {lmliu,first.last}@nict.go.jp

Pseudocode: No
LLM Response: The paper describes methods and algorithms in prose and equations, but does not include structured pseudocode or algorithm blocks.

Open Source Code: Yes
LLM Response: Our toolkit is publicly available on https://github.com/lemaoliu/Agtarbidir.

Open Datasets: Yes
LLM Response: For grapheme-to-phoneme (GM-PM) conversion, the standard CMUdict data sets (http://www.speech.cs.cmu.edu/cgi-bin/cmudict) were used: the original training set was randomly split into our training set (about 110000 sequence pairs) and development set (2000 pairs); the original test set consisting of about 12000 pairs was used for testing. The transliteration training, development and test sets were taken from Wikipedia inter-language link titles (www.dumps.wikimedia.org): the training data consisted of 59000 sequence pairs...

Dataset Splits: Yes
LLM Response: For grapheme-to-phoneme (GM-PM) conversion, the standard CMUdict data sets were used: the original training set was randomly split into our training set (about 110000 sequence pairs) and development set (2000 pairs); the original test set consisting of about 12000 pairs was used for testing. For the machine transliteration task, we conducted both Japanese-to-English (JP-EN) and English-to-Japanese (EN-JP) directional subtasks. The transliteration training, development and test sets were taken from Wikipedia inter-language link titles: the training data consisted of 59000 sequence pairs composed of 313378 Japanese katakana characters and 445254 English characters; the development and test data were manually cleaned and each of them consisted of 1000 sequence pairs.

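The CMUdict train/development split quoted above is a random hold-out from the original training set rather than a published fixed split, so exact reproduction depends on the authors' released toolkit. The short Python sketch below illustrates that kind of split; the function name, seed, and toy data are assumptions for illustration, not the paper's code.

```python
# Minimal sketch (not the authors' released code) of the kind of random
# train/development split described above: shuffle the original CMUdict
# training pairs and hold out 2000 of them for development.
# The seed, function name, and toy data are illustrative assumptions.
import random

def split_train_dev(pairs, dev_size=2000, seed=0):
    """Return (train, dev) after shuffling the original training pairs."""
    rng = random.Random(seed)
    shuffled = list(pairs)
    rng.shuffle(shuffled)
    return shuffled[dev_size:], shuffled[:dev_size]

# Toy stand-in for (grapheme sequence, phoneme sequence) pairs read from CMUdict.
pairs = [("HELLO", "HH AH L OW"), ("WORLD", "W ER L D"), ("SEQUENCE", "S IY K W AH N S")]
train, dev = split_train_dev(pairs, dev_size=1, seed=0)
```
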
Hardware Specification: No
LLM Response: The paper mentions 'efficient use of a GPU' but does not provide specific hardware details such as GPU model numbers, CPU types, or memory specifications used for experiments.

Software Dependencies: No
LLM Response: The paper mentions using Theano and adadelta for training, and specifies adadelta parameters (decay rate ρ = 0.95 and constant ϵ = 10^-6), but it does not provide specific version numbers for these software components.

Experiment Setup: Yes
LLM Response: For all of the re-implemented models, the number of word embedding units and hidden units was set to 500 to match the configuration used in the NTM. We used adadelta for training both the GLSTM and the proposed systems: the decay rate ρ and constant ϵ were set to 0.95 and 10^-6 as suggested by (Zeiler 2012), and minibatch sizes were 16 and 64 for the machine transliteration and GM-PM tasks, respectively.

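The quoted setup pins down the optimizer and model sizes. As a compact reference, here is a NumPy sketch of the Adadelta update of Zeiler (2012) using the reported decay rate and constant, with the reported sizes gathered into a config dict; this is an illustrative sketch under those assumptions, not the authors' Theano implementation.

```python
# Sketch of the Adadelta update rule (Zeiler 2012) with the hyperparameters
# reported above: decay rate rho = 0.95 and constant eps = 1e-6.
# Illustrative only; the paper's training code is Theano-based.
import numpy as np

class Adadelta:
    def __init__(self, shape, rho=0.95, eps=1e-6):
        self.rho, self.eps = rho, eps
        self.acc_grad = np.zeros(shape)   # running average of squared gradients
        self.acc_delta = np.zeros(shape)  # running average of squared updates

    def step(self, grad):
        """Return the parameter update for one gradient."""
        self.acc_grad = self.rho * self.acc_grad + (1 - self.rho) * grad ** 2
        delta = -np.sqrt(self.acc_delta + self.eps) / np.sqrt(self.acc_grad + self.eps) * grad
        self.acc_delta = self.rho * self.acc_delta + (1 - self.rho) * delta ** 2
        return delta  # add this to the parameters

# Model and batch sizes reported in the paper.
config = {
    "embedding_units": 500,
    "hidden_units": 500,
    "minibatch_size": {"machine_transliteration": 16, "grapheme_to_phoneme": 64},
}
```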