Learning Approximate Inference Networks for Structured Prediction
Authors: Lifu Tu, Kevin Gimpel
ICLR 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Section 7 (Experiments): In Sec. 7.1 we compare our approach to previous work on training SPENs for MLC. We compare accuracy and speed, finding our approach to outperform prior work. We then perform experiments with sequence labeling tasks in Sec. 7.2. (Table 1: Test F1 when comparing methods on multi-label classification datasets.) |
| Researcher Affiliation | Academia | Lifu Tu Kevin Gimpel Toyota Technological Institute at Chicago, Chicago, IL, 60637, USA {lifu,kgimpel}@ttic.edu |
| Pseudocode | No | The paper contains no structured pseudocode or algorithm blocks, i.e., nothing explicitly labeled or formatted as a code-like procedure. |
| Open Source Code | No | The paper does not provide an explicit statement about releasing the source code for the methodology described, nor does it include a direct link to a code repository. |
| Open Datasets | Yes | We use the MLC datasets used by Belanger & McCallum (2016): Bibtex, Delicious, and Bookmarks. Dataset statistics are shown in Table 7 in the Appendix. For Twitter part-of-speech (POS) tagging, we use the annotated data from Gimpel et al. (2011) and Owoputi et al. (2013) which contains L = 25 POS tags. |
| Dataset Splits | Yes | For validation, we use the 500-tweet OCT27TEST set and for testing we use the 547-tweet DAILY547 test set. For Bookmarks, we use the same train/dev/test split as Belanger & McCallum (2016). |
| Hardware Specification | No | The Acknowledgments thank "NVIDIA Corporation for donating GPUs used in this research", but the paper does not specify exact GPU models, CPU models, or any other hardware details for running the experiments. |
| Software Dependencies | No | The paper mentions using optimizers like Adam and libraries/tools like word2vec and GloVe, but does not provide specific version numbers for these software components or any other ancillary software dependencies. |
| Experiment Setup | Yes | We pretrain the feature networks F(x) by minimizing independent-label cross entropy for 10 epochs using Adam (Kingma & Ba, 2014) with learning rate 0.001. We tune λ (the L2 regularization strength for Θ) over the set {0.01, 0.001, 0.0001}. The classification threshold τ is chosen from [0, 0.01, 0.02, 0.03, 0.04, 0.05, 0.1, 0.15, 0.2, 0.25, 0.3, 0.35, 0.4, 0.45, 0.5, 0.55, 0.6, 0.65, 0.7, 0.75]. |
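
The quoted experiment setup (pretraining the feature network F(x) with independent-label cross entropy for 10 epochs using Adam at learning rate 0.001, tuning the L2 strength λ, and selecting the classification threshold τ on a dev set) can be illustrated with a minimal sketch. This is not the authors' code: the `FeatureNet` architecture, hidden size, data shapes, and the example-averaged F1 used for threshold selection are illustrative assumptions; only the optimizer, learning rate, epoch count, and the λ/τ grids come from the quote.

```python
# Hedged sketch of the quoted setup, not the authors' implementation.
import torch
import torch.nn as nn

class FeatureNet(nn.Module):
    """Hypothetical feed-forward feature network F(x) producing per-label logits."""
    def __init__(self, n_features, n_labels, hidden=150):  # hidden size is an assumption
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, hidden), nn.ReLU(),
            nn.Linear(hidden, n_labels),
        )

    def forward(self, x):
        return self.net(x)

def pretrain(model, loader, epochs=10, lr=1e-3, weight_decay=0.0):
    # Independent-label cross entropy = per-label binary cross entropy.
    # weight_decay stands in for the L2 strength lambda, tuned over
    # {0.01, 0.001, 0.0001} in the quoted setup.
    opt = torch.optim.Adam(model.parameters(), lr=lr, weight_decay=weight_decay)
    loss_fn = nn.BCEWithLogitsLoss()
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            loss_fn(model(x), y.float()).backward()
            opt.step()
    return model

def f1_score(pred, gold):
    # Example-averaged multi-label F1 (one plausible choice; the paper's
    # exact averaging is not restated in the quote).
    tp = (pred * gold).sum(dim=1).float()
    denom = pred.sum(dim=1) + gold.sum(dim=1)
    f1 = torch.where(denom > 0, 2 * tp / denom.clamp(min=1), torch.ones_like(tp))
    return f1.mean().item()

def tune_threshold(model, x_dev, y_dev):
    # Pick tau from the grid quoted above by dev-set F1.
    taus = [0, 0.01, 0.02, 0.03, 0.04, 0.05, 0.1, 0.15, 0.2, 0.25,
            0.3, 0.35, 0.4, 0.45, 0.5, 0.55, 0.6, 0.65, 0.7, 0.75]
    with torch.no_grad():
        probs = torch.sigmoid(model(x_dev))
    return max(taus, key=lambda t: f1_score((probs > t).long(), y_dev.long()))
```

In use, one would pretrain a copy of the model for each λ in {0.01, 0.001, 0.0001}, keep the copy with the best dev F1, and then call `tune_threshold` on the dev set before reporting test F1, mirroring the tuning procedure described in the quote.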