Low-Rank Constraints for Fast Inference in Structured Models
Authors: Justin Chiu, Yuntian Deng, Alexander Rush
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments with neural parameterized structured models for language modeling, polyphonic music modeling, unsupervised grammar induction, and video modeling show that our approach matches the accuracy of standard models at large state spaces while providing practical speedups. |
| Researcher Affiliation | Academia | Justin T. Chiu Cornell University jtc257@cornell.edu Yuntian Deng Harvard University dengyuntian@seas.harvard.edu Alexander M. Rush Cornell University arush@cornell.edu |
| Pseudocode | Yes | Algorithm 1 Hypergraph marginalization |
| Open Source Code | Yes | Code is available here. |
| Open Datasets | Yes | Our first set of experiments evaluate sequential models on PENN TREEBANK dataset (PTB) [Marcus et al., 1993]... The last set of experiments use HSMMs for video modeling, where we use CROSSTASK [Zhukov et al., 2019] with 10% of the training data for validation. |
| Dataset Splits | Yes | The last set of experiments use HSMMs for video modeling, where we use CROSSTASK [Zhukov et al., 2019] with 10% of the training data for validation. |
| Hardware Specification | No | Due to GPU memory constraints, we can only train HSMMs up to 28 states. |
| Software Dependencies | No | The paper does not provide specific ancillary software details with version numbers, such as library or solver names with their corresponding versions. |
| Experiment Setup | Yes | For the full hyperparameter and optimization details, see Appendix G. We optimize all models using Adam [Kingma and Ba, 2017] with a decoupled weight decay [Loshchilov and Hutter, 2017] of 0.01. We use a constant learning rate of 0.001. |
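The paper's title refers to constraining a structured model's transition matrix to be low rank so that marginalization is fast. The idea can be illustrated on an HMM forward pass: if the transition matrix factors as A = U Vᵀ with rank r ≪ n, then each step α ← (α A) ⊙ b can be computed as ((α U) Vᵀ) ⊙ b in O(nr) instead of O(n²). The following is a minimal NumPy sketch of that identity, not the authors' code; the dimensions, random potentials, and function names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, r, T = 64, 8, 10  # number of states, rank, sequence length (illustrative)

# Build a row-stochastic, rank-r transition matrix A = U @ V.T by folding
# the row normalization into U (keeps the low-rank factorization intact).
U = rng.random((n, r))
V = rng.random((n, r))
row_sums = (U @ V.T).sum(axis=1)
U = U / row_sums[:, None]
A = U @ V.T  # rows sum to 1, rank <= r

# Made-up emission potentials b[t] and a uniform initial distribution.
b = rng.random((T, n))
alpha0 = np.full(n, 1.0 / n)

def forward_full(alpha0, A, b):
    # Standard forward pass: each step costs O(n^2).
    alpha = alpha0 * b[0]
    for t in range(1, len(b)):
        alpha = (alpha @ A) * b[t]
    return alpha.sum()

def forward_lowrank(alpha0, U, V, b):
    # Low-rank forward pass: alpha @ A = (alpha @ U) @ V.T, so each
    # step costs O(n * r) instead of O(n^2).
    alpha = alpha0 * b[0]
    for t in range(1, len(b)):
        alpha = ((alpha @ U) @ V.T) * b[t]
    return alpha.sum()
```

Both functions compute the same sequence marginal likelihood; only the per-step cost differs, which is the source of the practical speedups the quoted abstract reports at large state spaces.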
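The quoted optimization details (Adam with decoupled weight decay of 0.01 and a constant learning rate of 0.001) correspond to the AdamW variant. A minimal PyTorch configuration sketch, assuming PyTorch is the framework and using a placeholder model in place of the paper's structured models:

```python
import torch

# Placeholder module for illustration only; the paper's models are
# structured (HMM / HSMM / PCFG) with neural parameterizations.
model = torch.nn.Linear(16, 16)

# Adam with decoupled weight decay [Loshchilov and Hutter, 2017] is
# torch.optim.AdamW; constant lr of 0.001, weight decay of 0.01 as quoted.
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=0.01)
```

No learning-rate scheduler is attached, matching the quoted "constant learning rate"; remaining hyperparameters are deferred to the paper's Appendix G.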