Segmental Recurrent Neural Networks

Authors: Lingpeng Kong, Chris Dyer, Noah A. Smith

ICLR 2016

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We present two sets of experiments to compare segmental recurrent neural networks against models that do not include explicit representations of segmentation.
Researcher Affiliation | Academia | Lingpeng Kong, Chris Dyer (School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA; {lingpenk, cdyer}@cs.cmu.edu) and Noah A. Smith (Computer Science & Engineering, University of Washington, Seattle, WA 98195, USA; nasmith@cs.washington.edu).
Pseudocode | No | The paper describes its algorithms in prose and mathematical equations (e.g., dynamic programming recurrences; a sketch of such a recurrence follows the table), but no structured pseudocode or algorithm blocks are labeled or formatted as such.
Open Source Code | No | The paper makes no statement about releasing source code and provides no link to a code repository for the described methodology.
Open Datasets | Yes | We use the handwriting dataset from Kassel (1995). For the joint Chinese word segmentation and POS tagging task, we use the Penn Chinese Treebank 5 (Xue et al., 2005), following the standard train/dev/test splits. For the pure Chinese word segmentation task, we used the SIGHAN 2005 dataset.
Dataset Splits | Yes | The dataset is split into train, development, and test sets following Kassel (1995); Table 1 presents the statistics for the dataset. For the joint Chinese word segmentation and POS tagging task, the paper follows the standard Penn Chinese Treebank 5 (Xue et al., 2005) train/dev/test splits.
Hardware Specification | No | The paper mentions "using a single CPU" when discussing training speed, but it does not specify a CPU model, GPU, memory, or any other hardware details used to run the experiments.
Software Dependencies | No | The paper mentions Adam (Kingma & Ba, 2014) for optimization and Wang2Vec (Ling et al., 2015) for character embeddings, but it gives no version numbers for these or any other software libraries or dependencies.
Experiment Setup | Yes | We use Adam (Kingma & Ba, 2014) with λ = 1 × 10⁻⁶ to optimize the parameters in the models. We used 5 as the hidden state dimension in the bidirectional RNNs, which map the points into fixed-length stroke embeddings (hence the input vector size 5 × 2 = 10). We set the hidden dimension of c in our model and the CTC model to 24 and the segment embedding h in our model to 18. For both tasks, the dimension of the input character embedding is 64. For our model, the dimension of c and the segment embedding h is set to 24. For the baseline bidirectional LSTM tagger, we set the hidden dimension (the c equivalent) to 128. (A sketch wiring these hyperparameters together follows the table.)
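
The paper marginalizes over segmentations with a forward dynamic program rather than labeled pseudocode. As a rough illustration of the kind of recurrence involved, here is a minimal sketch: `log_seg_score` is a hypothetical stand-in for the log-potentials the paper derives from bidirectional RNN segment embeddings, and nothing below is the authors' actual implementation.

```python
import numpy as np

def log_partition(log_seg_score, T, max_len):
    """Forward DP over all segmentations of a length-T sequence.

    A minimal sketch of the kind of recurrence the paper describes in
    equations; log_seg_score(i, j) is assumed to return the log-potential
    of a segment spanning positions i..j-1 (hypothetical interface).
    """
    alpha = np.full(T + 1, -np.inf)  # alpha[j]: log-sum over segmentations of the length-j prefix
    alpha[0] = 0.0                   # the empty prefix has probability 1
    for j in range(1, T + 1):
        # sum over the possible lengths of the segment ending at position j
        terms = [alpha[i] + log_seg_score(i, j)
                 for i in range(max(0, j - max_len), j)]
        alpha[j] = np.logaddexp.reduce(terms)
    return alpha[T]

# Toy usage: uniform segment scores over a 4-step sequence.
print(log_partition(lambda i, j: 0.0, T=4, max_len=2))
```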
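
Similarly, the quoted hyperparameters can be made concrete with a short configuration sketch. Since the paper releases no code, the PyTorch wiring below is purely illustrative: the class name, vocabulary size, and optimizer settings are assumptions; only the dimensions and the choice of Adam come from the paper.

```python
import torch
import torch.nn as nn

# Dimensions quoted in the Experiment Setup row; everything else
# (class name, vocabulary size, learning rate) is a hypothetical choice.
class SRNNConfig:
    stroke_rnn_hidden = 5      # per-direction hidden size for the stroke BiRNNs
    stroke_input_dim = 5 * 2   # bidirectional, hence input vector size 10
    char_embed_dim = 64        # input character embedding (Chinese tasks)
    context_dim = 24           # dimension of c in the SRNN and CTC models
    segment_embed_dim = 18     # segment embedding h (handwriting; 24 for Chinese tasks)
    baseline_hidden = 128      # hidden size of the baseline BiLSTM tagger

cfg = SRNNConfig()
char_embed = nn.Embedding(num_embeddings=5000,  # 5000 is an assumed vocabulary size
                          embedding_dim=cfg.char_embed_dim)
context_rnn = nn.LSTM(cfg.char_embed_dim, cfg.context_dim, bidirectional=True)

# The paper optimizes with Adam and reports lambda = 1e-6; which Adam
# hyperparameter lambda refers to is not spelled out, so defaults are kept here.
optimizer = torch.optim.Adam(
    list(char_embed.parameters()) + list(context_rnn.parameters())
)
```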