How recurrent networks implement contextual processing in sentiment analysis

Authors: Niru Maheswaranathan, David Sussillo

ICML 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We apply these methods to understand RNNs trained on sentiment classification. This analysis reveals inputs that induce contextual effects, quantifies the strength and timescale of these effects, and identifies sets of these inputs with similar properties. Additionally, we analyze contextual effects related to differential processing of the beginning and end of documents. Using the insights learned from the RNNs we improve baseline Bag-of-Words models with simple extensions that incorporate contextual modification, recovering greater than 90% of the RNN's performance increase over the baseline."
Researcher Affiliation | Industry | "Google Research, Brain Team, Mountain View, California, USA. Correspondence to: Niru Maheswaranathan <nirum@google.com>, David Sussillo <sussillo@google.com>."
Pseudocode | No | The paper describes equations and concepts but does not provide any structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not contain any explicit statement about releasing source code, nor a link to a code repository.
Open Datasets | Yes | "We apply these methods to understand RNNs trained on sentiment classification. [...] We turn our attention now to natural language, studying our best performing RNN, a GRU, trained to perform sentiment classification on the Yelp 2015 dataset (Zhang et al., 2015)."
Dataset Splits | Yes | "We selected hyperparameters (learning rate, learning rate decay, momentum, an ℓ2 regularization penalty, and dropout rate) via a validation set using random search."
Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or memory specifications used for the experiments.
Software Dependencies | No | The paper mentions training models with the Adam optimizer (Kingma & Ba, 2014) and cites a TensorFlow toolbox in the bibliography (Golub & Sussillo, 2018), but does not specify version numbers for any software dependencies.
Experiment Setup | Yes | "We selected hyperparameters (learning rate, learning rate decay, momentum, an ℓ2 regularization penalty, and dropout rate) via a validation set using random search. We found dropout directly on the input words to be a useful regularizer for the more powerful models."
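For context on the hyperparameter selection quoted above, a minimal random-search sketch over the five tuned quantities (learning rate, learning rate decay, momentum, ℓ2 penalty, dropout rate). All ranges, trial counts, and the scoring function are illustrative assumptions, not values from the paper:

```python
import random

# Illustrative search ranges (assumptions, not taken from the paper).
SEARCH_SPACE = {
    "learning_rate": lambda: 10 ** random.uniform(-4, -1),  # log-uniform
    "lr_decay": lambda: random.uniform(0.8, 1.0),
    "momentum": lambda: random.uniform(0.0, 0.99),
    "l2_penalty": lambda: 10 ** random.uniform(-6, -2),     # log-uniform
    "dropout_rate": lambda: random.uniform(0.0, 0.5),       # dropout on input words
}

def sample_config():
    """Draw one random hyperparameter configuration."""
    return {name: draw() for name, draw in SEARCH_SPACE.items()}

def random_search(validation_score, num_trials=50, seed=0):
    """Keep the configuration with the best validation-set score."""
    random.seed(seed)
    best_config, best_score = None, float("-inf")
    for _ in range(num_trials):
        config = sample_config()
        score = validation_score(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score

def toy_validation_score(config):
    """Stand-in for training an RNN and scoring it on the validation set."""
    return -abs(config["learning_rate"] - 0.01) - 0.1 * config["dropout_rate"]

best, score = random_search(toy_validation_score, num_trials=100)
```

In practice `toy_validation_score` would be replaced by a function that trains the model with the sampled configuration and returns its validation accuracy.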