Segmental Recurrent Neural Networks
Authors: Lingpeng Kong, Chris Dyer, Noah A. Smith
ICLR 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We present two sets of experiments to compare segmental recurrent neural networks against models that do not include explicit representations of segmentation. |
| Researcher Affiliation | Academia | Lingpeng Kong, Chris Dyer (School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA; {lingpenk, cdyer}@cs.cmu.edu) and Noah A. Smith (Computer Science & Engineering, University of Washington, Seattle, WA 98195, USA; nasmith@cs.washington.edu) |
| Pseudocode | No | The paper describes its algorithms in prose and mathematical equations (e.g., dynamic programming recurrences), but no explicitly labeled pseudocode or algorithm blocks appear (an illustrative sketch of such a recurrence is given below the table). |
| Open Source Code | No | The paper does not provide an explicit statement about releasing source code or a direct link to a code repository for the methodology described. |
| Open Datasets | Yes | We use the handwriting dataset from Kassel (1995). For the joint Chinese word segmentation and POS tagging task, we use the Penn Chinese Treebank 5 (Xue et al., 2005), following the standard train/dev/test splits. For the pure Chinese word segmentation task, we used the SIGHAN 2005 dataset. |
| Dataset Splits | Yes | The dataset is split into train, development and test set following Kassel (1995). Table 1 presents the statistics for the dataset. For the joint Chinese word segmentation and POS tagging task, we use the Penn Chinese Treebank 5 (Xue et al., 2005), following the standard train/dev/test splits. |
| Hardware Specification | No | The paper mentions "using a single CPU" when discussing training speed, but it does not specify any particular CPU model, GPU, memory, or other detailed hardware specifications used for running the experiments. |
| Software Dependencies | No | The paper mentions using "Adam (Kingma & Ba, 2014)" for optimization and "Wang2Vec (Ling et al., 2015)" for character embeddings, but it does not provide specific version numbers for these or any other software libraries or dependencies. |
| Experiment Setup | Yes | We use Adam (Kingma & Ba, 2014) with λ = 1 × 10⁻⁶ to optimize the parameters in the models. We used 5 as the hidden state dimension in the bidirectional RNNs, which map the points into fixed-length stroke embeddings (hence the input vector size 5 × 2 = 10). We set the hidden dimensions of c in our model and the CTC model to 24 and the segment embedding h in our model to 18. For both tasks, the dimension for the input character embedding is 64. For our model, the dimension for c and the segment embedding h is set to 24. For the baseline bi-directional LSTM tagger, we set the hidden dimension (the c equivalent) size to 128. (The reported values are gathered into a configuration sketch below the table.) |
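As noted in the Pseudocode row, the paper gives its segmental (semi-Markov) inference procedure as dynamic programming recurrences rather than pseudocode. The sketch below is a minimal, generic semi-Markov forward pass, not the authors' code: it assumes a maximum segment length and a black-box `score(i, j)` function (in the paper, segment scores would come from bidirectional RNN embeddings of each candidate segment).

```python
import math

def log_sum_exp(values):
    """Numerically stable log-sum-exp over a list of log-scores."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def segmental_forward(score, n, max_len):
    """Forward pass of a generic semi-Markov dynamic program.

    score(i, j) returns the log-score of a segment covering positions
    i..j-1. alpha[j] accumulates the log-sum of the scores of all
    segmentations of the first j positions; alpha[n] is the log
    partition function over all segmentations of the full sequence.
    """
    alpha = [float("-inf")] * (n + 1)
    alpha[0] = 0.0
    for j in range(1, n + 1):
        candidates = [
            alpha[i] + score(i, j)
            for i in range(max(0, j - max_len), j)
        ]
        alpha[j] = log_sum_exp(candidates)
    return alpha[n]

# Toy usage with a hypothetical length-penalty segment score:
if __name__ == "__main__":
    print(segmental_forward(lambda i, j: -1.0 * (j - i), n=6, max_len=3))
```

Replacing `log_sum_exp` with `max` (and tracking back-pointers) turns the same recurrence into Viterbi decoding over segmentations.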
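For reference, the hyperparameters quoted in the Experiment Setup row can be collected into a single place. The field names below are hypothetical labels chosen for readability; only the values are taken from the paper.

```python
# Hyperparameters reported in the paper, gathered for reference.
# Field names are hypothetical; values are from the quoted setup.

HANDWRITING_CONFIG = {
    "optimizer": "adam",           # Adam (Kingma & Ba, 2014)
    "lambda": 1e-6,                # λ = 1 × 10⁻⁶, as reported
    "stroke_rnn_hidden_dim": 5,    # bidirectional RNNs over pen points
    "input_dim": 10,               # 5 × 2 directions = 10
    "context_dim": 24,             # c in the segmental model and CTC baseline
    "segment_embedding_dim": 18,   # segment embedding h
}

CHINESE_CONFIG = {
    "optimizer": "adam",
    "lambda": 1e-6,
    "char_embedding_dim": 64,      # input character embeddings
    "context_and_segment_dim": 24, # c and segment embedding h
    "baseline_lstm_hidden_dim": 128,  # bi-directional LSTM tagger baseline
}
```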