Low-Rank Constraints for Fast Inference in Structured Models

Authors: Justin Chiu, Yuntian Deng, Alexander Rush

NeurIPS 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments with neural parameterized structured models for language modeling, polyphonic music modeling, unsupervised grammar induction, and video modeling show that our approach matches the accuracy of standard models at large state spaces while providing practical speedups.
Researcher Affiliation | Academia | Justin T. Chiu, Cornell University, jtc257@cornell.edu; Yuntian Deng, Harvard University, dengyuntian@seas.harvard.edu; Alexander M. Rush, Cornell University, arush@cornell.edu
Pseudocode | Yes | Algorithm 1: Hypergraph marginalization
Open Source Code | Yes | Code is available here.
Open Datasets | Yes | Our first set of experiments evaluate sequential models on the PENN TREEBANK dataset (PTB) [Marcus et al., 1993]... The last set of experiments use HSMMs for video modeling, where we use CROSSTASK [Zhukov et al., 2019] with 10% of the training data for validation.
Dataset Splits | Yes | The last set of experiments use HSMMs for video modeling, where we use CROSSTASK [Zhukov et al., 2019] with 10% of the training data for validation.
Hardware Specification | No | Due to GPU memory constraints, we can only train HSMMs up to 28 states.
Software Dependencies | No | The paper does not provide ancillary software details, such as library or solver names with their corresponding versions.
Experiment Setup | Yes | For the full hyperparameter and optimization details, see Appendix G. We optimize all models using Adam [Kingma and Ba, 2017] with a decoupled weight decay [Loshchilov and Hutter, 2017] of 0.01. We use a constant learning rate of 0.001.

Illustrative code sketches for the Pseudocode, Dataset Splits, and Experiment Setup rows follow.
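The Pseudocode row above refers to Algorithm 1 (hypergraph marginalization) in the paper. To make the underlying idea concrete, here is a minimal NumPy sketch of how a low-rank constraint on an HMM transition matrix speeds up marginalization: with A = U V^T of rank r, the O(Z^2) matrix-vector product in each forward-pass step becomes two O(Z r) products. The function names, toy dimensions, and random parameters below are illustrative assumptions, not the authors' implementation or the paper's general hypergraph formulation.

```python
import numpy as np

def forward_dense(pi, A, emit):
    """Standard HMM forward pass returning log p(x_1..x_T); O(T * Z^2).

    pi:   (Z,)   initial state distribution
    A:    (Z, Z) row-stochastic transition matrix
    emit: (T, Z) emission likelihoods p(x_t | z_t)
    """
    alpha = pi * emit[0]
    logZ = 0.0
    for t in range(1, len(emit)):
        alpha = (alpha @ A) * emit[t]            # O(Z^2) per step
        s = alpha.sum()
        alpha /= s
        logZ += np.log(s)
    return logZ + np.log(alpha.sum())

def forward_low_rank(pi, U, V, emit):
    """Forward pass with a rank-r transition A = U @ V.T; O(T * Z * r).

    U, V: (Z, r) nonnegative factors chosen so that U @ V.T is row-stochastic.
    """
    alpha = pi * emit[0]
    logZ = 0.0
    for t in range(1, len(emit)):
        alpha = ((alpha @ U) @ V.T) * emit[t]    # two O(Z * r) products
        s = alpha.sum()
        alpha /= s
        logZ += np.log(s)
    return logZ + np.log(alpha.sum())

# Tiny sanity check with illustrative sizes: both routines agree when A = U @ V.T.
rng = np.random.default_rng(0)
Z, r, T = 8, 3, 5
U, V = rng.random((Z, r)), rng.random((Z, r))
U /= (U @ V.T).sum(axis=1, keepdims=True)        # make U @ V.T row-stochastic
A = U @ V.T
pi = rng.random(Z)
pi /= pi.sum()
emit = rng.random((T, Z))
assert np.allclose(forward_dense(pi, A, emit), forward_low_rank(pi, U, V, emit))
```

The speedup comes purely from reassociating the matrix product; the low-rank factorization is a modeling constraint the paper imposes so that this reassociation is exact rather than an approximation.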
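The Dataset Splits row states only that 10% of the CrossTask training data is held out for validation. One trivial way to express such a split is sketched below; the variable names and seed are assumptions, since the paper does not specify the split procedure beyond the 10% figure.

```python
import random

# Stand-in for the CrossTask training example ids; the real ids come from the dataset.
train_ids = list(range(1000))

random.seed(0)                       # assumed seed, only to make the illustrative split reproducible
random.shuffle(train_ids)
n_val = int(0.1 * len(train_ids))    # 10% of the training data for validation
val_ids, train_ids = train_ids[:n_val], train_ids[n_val:]
```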
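The Experiment Setup row gives the optimizer family and two hyperparameters; the remaining details are in the paper's Appendix G. In PyTorch, Adam with a decoupled weight decay of 0.01 corresponds to AdamW. A minimal sketch, with a placeholder model standing in for the paper's structured models:

```python
import torch

# Placeholder module; the paper's neural parameterized structured models are not reproduced here.
model = torch.nn.Linear(256, 256)

# Adam with decoupled weight decay (AdamW): weight decay 0.01,
# constant learning rate 0.001, no scheduler.
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=0.01)
```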