Decoding Layer Saliency in Language Transformers

Authors: Elizabeth Mary Hou, Gregory David Castanon

ICML 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility assessment: each entry below gives the variable, the result, and the LLM response that supports it.
Research Type: Experimental. "We adapt gradient-based saliency methods for these networks, propose a method for evaluating the degree of semantic coherence of each layer, and demonstrate consistent improvement over numerous other methods for textual saliency on multiple benchmark classification datasets. Our approach requires no additional training or access to labelled data, and is comparatively very computationally efficient. In this section, we detail our experimental results on two benchmark classification task datasets."
Researcher Affiliation: Industry. "Elizabeth M. Hou and Gregory Castanon, STR, 600 West Cummings Park, Woburn, MA 01801, USA."
Pseudocode: Yes. "We show pseudocode for our approach in Algorithm 1, which takes as inputs a tokenized input sequence t, a layer choice l, a threshold τ for the number of contributions from top-ranked tokens, a feature aggregation function for a saliency method g(·), an LM task head from a pre-trained model f^lm(·), and a fine-tuned classification model m(·), where m^base_l(·) is the output of layer l in the transformer stack and m^class(·) is the classification task head."
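
The inputs listed for Algorithm 1 map naturally onto the Hugging Face transformers API. The sketch below is our own minimal reading of that flow, not the authors' released code (the paper links none): the fine-tuned checkpoint name is illustrative, gradient×input stands in for the feature aggregation g(·), and layer_saliency_decode is a name we introduce here.

```python
# Minimal sketch of the Algorithm 1 flow described above; NOT the authors'
# code. Assumes the Hugging Face transformers API; the fine-tuned checkpoint
# name is illustrative, and gradient-x-input stands in for g(.).
import torch
from transformers import (AutoTokenizer, RobertaForMaskedLM,
                          RobertaForSequenceClassification)

tok = AutoTokenizer.from_pretrained("roberta-base")
mlm = RobertaForMaskedLM.from_pretrained("roberta-base")  # f^lm: pre-trained LM head
clf = RobertaForSequenceClassification.from_pretrained(
    "textattack/roberta-base-SST-2")  # m(.): fine-tuned classifier (illustrative checkpoint)

def layer_saliency_decode(text: str, layer: int = 8, tau: int = 5):
    """Rank tokens by saliency at layer `layer`, then decode the top-tau
    activations through the pre-trained LM head."""
    enc = tok(text, return_tensors="pt")
    out = clf(**enc, output_hidden_states=True)
    h_l = out.hidden_states[layer]     # m^base_l: activations after layer l
    h_l.retain_grad()                  # keep the gradient of this non-leaf tensor
    pred = out.logits.argmax(dim=-1).item()
    out.logits[0, pred].backward()     # backprop the predicted-class score

    # g(.): aggregate per-token saliency; gradient-x-input is one common
    # gradient-based choice, though the paper's g(.) may differ.
    scores = (h_l.grad * h_l.detach()).sum(-1).abs()[0]
    top = scores.topk(min(tau, scores.numel())).indices

    # Decode the tau most salient layer-l activations through f^lm.
    with torch.no_grad():
        vocab_logits = mlm.lm_head(h_l.detach()[0, top])
    return tok.convert_ids_to_tokens(vocab_logits.argmax(-1).tolist())
```

Only the scores line changes if a different gradient-based saliency method is substituted; the final step, pushing layer activations through the pre-trained LM head, is what maps saliency at layer l back to vocabulary tokens.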
Open Source Code: No. The paper does not provide any explicit links to open-source code for the proposed method.
Open Datasets: Yes. "SST-2 (Socher et al., 2013): This is a two-class version of the Stanford sentiment analysis corpus... AG News (Zhang et al., 2015): This is a four-class version of a corpus collected by Gulli (2005) from over 2,000 news sources."
Dataset Splits: Yes. "SST-2... It is split into 67,349 training samples, 872 validation samples, and 1,821 test samples; however, the test labels are not publicly available, and the validation set is commonly used for experiments in numerous papers, including this one. AG News... It is split into 120,000 training samples and 7,600 test samples."
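
Both corpora, with the split sizes quoted above, are mirrored on the Hugging Face datasets hub, which gives a quick way to sanity-check the counts. The hub identifiers below are our assumption, not taken from the paper:

```python
# Load the two benchmarks from the Hugging Face datasets hub and verify the
# split sizes quoted above. The hub identifiers are our assumption.
from datasets import load_dataset

sst2 = load_dataset("glue", "sst2")  # train / validation / test (test labels withheld)
ag_news = load_dataset("ag_news")    # train / test

print({name: len(split) for name, split in sst2.items()})
# expected: {'train': 67349, 'validation': 872, 'test': 1821}
print({name: len(split) for name, split in ag_news.items()})
# expected: {'train': 120000, 'test': 7600}
```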
Hardware Specification: No. The paper does not specify hardware details such as the GPU or CPU models used for the experiments; it mentions only using a "RoBERTa base from Hugging Face".
Software Dependencies: No. The paper mentions "RoBERTa base from Hugging Face (Wolf et al., 2020)" and the AllenNLP Interpret (Wallace et al., 2019) and Thermostat (Feldhus et al., 2021) Python packages, but does not provide version numbers for any of these software components.
Experiment Setup: No. Section A.1, titled "Experimental Setup", describes the datasets and models used, but it does not provide specific hyperparameters such as the learning rate, batch size, number of epochs, or optimizer settings needed to reproduce the training setup.