Kernel-Based Approaches for Sequence Modeling: Connections to Neural Methods

Authors: Kevin Liang, Guoyin Wang, Yitong Li, Ricardo Henao, Lawrence Carin

NeurIPS 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments are performed on natural language processing tasks and on analysis of local field potentials (neuroscience). We demonstrate that the variants we derive from kernels perform on par with or even better than traditional neural methods.
Researcher Affiliation | Academia | Kevin J. Liang, Guoyin Wang, Yitong Li, Ricardo Henao, Lawrence Carin; Department of Electrical and Computer Engineering, Duke University; {kevin.liang, guoyin.wang, yitong.li, ricardo.henao, lcarin}@duke.edu
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks (clearly labeled algorithm sections or code-like formatted procedures).
Open Source Code | Yes | Implementations can be found at https://github.com/kevinjliang/kernels2rnns.
Open Datasets | Yes | We show results for several popular document classification datasets [37] in Table 2. The AGNews and Yahoo! datasets are topic classification tasks, while Yelp Full is sentiment analysis and DBpedia is ontology classification. We also perform experiments on the popular word-level language generation datasets Penn Treebank (PTB) [24] and Wikitext-2 [26].
Dataset Splits | Yes | We also perform experiments on the popular word-level language generation datasets Penn Treebank (PTB) [24] and Wikitext-2 [26], reporting validation and test perplexities (PPL) in Table 3. In order to test model generalizability, we perform leave-one-out cross-validation: data from each mouse is iteratively held out as the test set while the remaining mice are used for training. (See the split sketch after the table.)
Hardware Specification | Yes | All experiments are run on a single NVIDIA Titan X GPU.
Software Dependencies | No | The paper mentions using a specific codebase (AWD-LSTM) and a technique (Layer Normalization) but does not provide specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions).
Experiment Setup | Yes | We use 300-dimensional GloVe [27] as our word embedding initialization and set the dimensions of all hidden units to 300. Layer normalization [2] is performed after the computation of the cell state c_t. For the Linear Kernel w/ o_t and the Linear Kernel, we set σ_i² = σ_f² = 0.5. (See the configuration sketch after the table.)
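
The Dataset Splits row describes leave-one-out cross-validation over mice for the local field potential experiments. Below is a minimal sketch of that split scheme, assuming scikit-learn's LeaveOneGroupOut; the array names (X, y, mouse_ids) and the data shapes are hypothetical placeholders, not taken from the paper.

import numpy as np
from sklearn.model_selection import LeaveOneGroupOut

# Hypothetical placeholders for illustration only: a feature matrix, labels,
# and the mouse each recording came from. The paper's actual data pipeline
# is not reproduced here.
X = np.random.randn(120, 300)            # 120 recordings, 300 features each
y = np.random.randint(0, 2, size=120)    # example binary labels
mouse_ids = np.repeat(np.arange(6), 20)  # 6 mice, 20 recordings per mouse

logo = LeaveOneGroupOut()
for fold, (train_idx, test_idx) in enumerate(logo.split(X, y, groups=mouse_ids)):
    # Each fold holds out every recording from one mouse as the test set,
    # matching the leave-one-out protocol quoted above.
    X_train, y_train = X[train_idx], y[train_idx]
    X_test, y_test = X[test_idx], y[test_idx]
    # ... train on (X_train, y_train) and evaluate on (X_test, y_test)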
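
The Experiment Setup row fixes all hidden dimensions at 300, initializes word embeddings from 300-dimensional GloVe, applies layer normalization after the cell state c_t is computed, and sets σ_i² = σ_f² = 0.5 for the linear-kernel variants. The PyTorch-style sketch below shows one plausible reading of that configuration, with the input and forget gates replaced by the constants σ_i² and σ_f²; it is an illustration under those assumptions, not the authors' implementation (see the linked repository for the real code).

import torch
import torch.nn as nn

class LinearKernelCell(nn.Module):
    """Illustrative recurrent cell with constant input/forget gates.

    Hypothetical sketch: the gates i_t and f_t are replaced by the fixed
    constants sigma_i^2 and sigma_f^2 (both 0.5, as in the Experiment Setup
    row), and layer normalization is applied right after the cell state c_t
    is computed. This is not the paper's code.
    """

    def __init__(self, input_dim=300, hidden_dim=300,
                 sigma_i_sq=0.5, sigma_f_sq=0.5):
        super().__init__()
        self.sigma_i_sq = sigma_i_sq
        self.sigma_f_sq = sigma_f_sq
        self.candidate = nn.Linear(input_dim + hidden_dim, hidden_dim)
        self.output_gate = nn.Linear(input_dim + hidden_dim, hidden_dim)
        self.layer_norm = nn.LayerNorm(hidden_dim)  # applied after computing c_t

    def forward(self, x_t, state):
        h_prev, c_prev = state
        xh = torch.cat([x_t, h_prev], dim=-1)
        c_tilde = torch.tanh(self.candidate(xh))
        # Constant gates in place of learned i_t and f_t.
        c_t = self.sigma_f_sq * c_prev + self.sigma_i_sq * c_tilde
        c_t = self.layer_norm(c_t)
        o_t = torch.sigmoid(self.output_gate(xh))  # the "w/ o_t" variant keeps a learned output gate
        h_t = o_t * torch.tanh(c_t)
        return h_t, (h_t, c_t)

In this reading, the 300-dimensional word embeddings would be initialized by copying GloVe vectors into an nn.Embedding weight matrix before training.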