On the Pitfalls of Analyzing Individual Neurons in Language Models

Authors: Omer Antverg, Yonatan Belinkov

ICLR 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We experiment with disentangling probe quality from ranking quality by using a probe from one method with a ranking from another method, and comparing the different probe-ranking combinations (a sketch of this pairing appears after the table). We primarily experiment with the M-BERT model (Devlin et al., 2019) on 9 languages and 13 morphological attributes from the Universal Dependencies dataset (Zeman et al., 2020). We also experiment with XLM-R (Conneau et al., 2020) and find that most of our results are similar between the models, with a few differences which we discuss. Our experiments reveal several insights.
Researcher Affiliation | Academia | Omer Antverg, Technion - Israel Institute of Technology, omer.antverg@cs.technion.ac.il; Yonatan Belinkov, Technion - Israel Institute of Technology, belinkov@technion.ac.il
Pseudocode | No | The paper describes its methods with mathematical formulas and textual explanations but does not include structured pseudocode blocks or algorithm listings.
Open Source Code | Yes | Our code is available at: https://github.com/technion-cs-nlp/Individual-Neurons-Pitfalls
Open Datasets | Yes | We primarily experiment with the M-BERT model (Devlin et al., 2019) on 9 languages and 13 morphological attributes from the Universal Dependencies dataset (Zeman et al., 2020).
Dataset Splits | Yes | We perform a sweep over the values of β in the range [1, 12] on a dev set.
Hardware Specification | Yes | We performed our experiments on an NVIDIA RTX 2080 Ti GPU.
Software Dependencies | No | For the language models, we used the implementations from the transformers library (Wolf et al., 2020) (a minimal extraction sketch using this library appears after the table). The paper does not provide specific version numbers for its software dependencies.
Experiment Setup | Yes | We use the hyperparameters reported in Durrani et al. (2020) and Torroba Hennigen et al. (2020) for training the classifiers. For an increasing k ∈ ℕ, we train a classifier f : H_k → Z to predict the task label, F(h), solely from h_Π(d)[:k] (the subvector of the representation h restricted to the top-k neurons in ranking Π), ignoring the rest of the neurons (see the top-k probing sketch after the table). We find β = 8 to be a balanced point, and thus report test results with β = 8 for three configurations in Table 1 and for the remaining configurations in Appendix A.11.
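
The Software Dependencies entry mentions the transformers library. The sketch below is illustrative rather than the authors' code: the bert-base-multilingual-cased checkpoint, the layer index, and first-subword pooling are assumptions, chosen only to show how per-word M-BERT representations could be collected for probing.

```python
# Minimal sketch (not the authors' code): collecting per-word M-BERT
# representations with the transformers library for later probing.
# Assumptions: bert-base-multilingual-cased checkpoint, one hidden layer,
# first-subword pooling for word-level morphological labels.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModel.from_pretrained("bert-base-multilingual-cased")
model.eval()

def extract_representations(words, layer=8):
    """Return one hidden vector per input word from the chosen layer."""
    encoding = tokenizer(words, is_split_into_words=True, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**encoding, output_hidden_states=True)
    hidden = outputs.hidden_states[layer][0]   # (num_subwords, 768)
    # Keep the first subword of each word as that word's representation.
    word_ids = encoding.word_ids(0)
    keep, seen = [], set()
    for idx, wid in enumerate(word_ids):
        if wid is not None and wid not in seen:
            seen.add(wid)
            keep.append(idx)
    return hidden[keep]                        # (num_words, 768)

# Example: one 768-dimensional row per word.
reps = extract_representations(["The", "cats", "sleep"])
```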
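
On the probe-ranking disentanglement described in the Research Type entry: the sketch below uses assumed interfaces rather than the paper's implementation. A ranking is an array of neuron indices, a probe is a scikit-learn classifier, and the particular rankings (linear-probe weight magnitude, class-mean gap) and probes (logistic regression, Gaussian naive Bayes) are stand-ins for the methods compared in the paper.

```python
# Minimal sketch (assumed interfaces): evaluate every (ranking, probe)
# combination on the same top-k neurons, so that probe quality and
# ranking quality can be assessed separately.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB

def rank_by_probe_weights(X_train, y_train):
    """Rank neurons by the weight magnitude of a trained linear probe."""
    probe = LogisticRegression(max_iter=1000).fit(X_train, y_train)
    return np.argsort(-np.abs(probe.coef_).sum(axis=0))

def rank_by_mean_gap(X_train, y_train):
    """Stand-in ranking: neurons with the largest gap between class means."""
    means = np.stack([X_train[y_train == c].mean(axis=0) for c in np.unique(y_train)])
    return np.argsort(-(means.max(axis=0) - means.min(axis=0)))

rankings = {"probe-weights": rank_by_probe_weights, "mean-gap": rank_by_mean_gap}
probes = {"linear": lambda: LogisticRegression(max_iter=1000),
          "gaussian": lambda: GaussianNB()}

def cross_evaluate(X_train, y_train, X_test, y_test, k=50):
    """Test accuracy of each probe type restricted to each ranking's top-k neurons."""
    results = {}
    for r_name, rank_fn in rankings.items():
        top_k = rank_fn(X_train, y_train)[:k]
        for p_name, make_probe in probes.items():
            clf = make_probe().fit(X_train[:, top_k], y_train)
            results[(r_name, p_name)] = clf.score(X_test[:, top_k], y_test)
    return results
```

Holding the probe fixed while varying the ranking (or vice versa) is what allows ranking quality and probe quality to be read off separately.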
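
For the top-k probing protocol in the Experiment Setup entry, a minimal sketch, assuming precomputed representations X, labels y, a ranking Π given as an index array, and a logistic-regression probe in place of the probes used in the paper:

```python
# Minimal sketch (assumptions: precomputed ranking, logistic-regression probe):
# train f on the subvector of each representation restricted to the top-k
# neurons of ranking Pi, for increasing k, and record test accuracy.
import numpy as np
from sklearn.linear_model import LogisticRegression

def top_k_probing_curve(X_train, y_train, X_test, y_test, ranking,
                        ks=(10, 50, 100, 300, 768)):
    """Accuracy when probing only the k highest-ranked neurons, for each k."""
    curve = {}
    for k in ks:
        neurons = np.asarray(ranking)[:k]          # indices of the top-k neurons in Pi
        probe = LogisticRegression(max_iter=1000)
        probe.fit(X_train[:, neurons], y_train)    # f sees only the selected subvector
        curve[k] = probe.score(X_test[:, neurons], y_test)
    return curve
```

The resulting accuracy-versus-k curve can then be compared across rankings while holding the probe fixed.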