Characterizing Large Language Model Geometry Helps Solve Toxicity Detection and Generation

Authors: Randall Balestriero, Romain Cosentino, Sarath Shekkizhar

ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our results demonstrate how, even in large-scale regimes, exact theoretical results can answer practical questions in LLMs. Code: https://github.com/RandallBalestriero/SplineLLM
Researcher Affiliation | Collaboration | 1Brown University, Computer Science Department; 2Tenyx. Correspondence to: Randall Balestriero <rbalestr@brown.edu>, Romain Cosentino <romain@tenyx.com>, Sarath Shekkizhar <sarath@tenyx.com>.
Pseudocode | Yes | Listing 1: code to use with the LlamaAttention class in the modeling_llama.py file of the Transformers package to obtain the intrinsic dimension ID^ℓ_ε(i) from Section 3.3. Listing 2: code to use with the LlamaMLP class in modeling_llama.py to obtain Eqs. (feature 1) to (feature 7).
Open Source Code | Yes | Code: https://github.com/RandallBalestriero/SplineLLM
Open Datasets | Yes | Omni-Toxic datasets. Non-toxic samples: the concatenation of a subsampled (20,000 samples) Pile validation dataset, the questions from the Dolly Q&A dataset, and the non-toxic samples from the Jigsaw dataset (Adams et al., 2017). Toxic samples: the toxic samples from the Jigsaw dataset, concatenated with our hand-crafted toxic-Pile dataset... Toxigen dataset (Hartvigsen et al., 2022).
Dataset Splits | No | The training procedure uses 70% of the dataset as the training set and evaluates performance on the held-out 30%. No explicit separate validation split is mentioned.
Hardware Specification | No | The paper mentions 'compute limitations' but does not specify the hardware used for experiments (e.g., GPU/CPU models, memory).
Software Dependencies | Yes | Our experiments are performed using the Llama2-7B model and its tokenizer (meta-llama/Llama-2-7b-chat-hf), available via the transformers package (v4.31.0).
Experiment Setup | Yes | Each sample is truncated to a 1024-token context length to accommodate our compute limitations. ... No cross-validation is employed for hyper-parameter selection; default parameters of the logistic regression and random forest models from sklearn are used.
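The intrinsic dimension ID^ℓ_ε(i) referenced in the Pseudocode row is defined in Section 3.3 of the paper; the exact listing is not reproduced here. A minimal sketch, assuming ID^ℓ_ε(i) counts the keys whose softmax attention weight for query token i exceeds a threshold ε (the function name and the stand-in attention weights below are illustrative, not the paper's code):

```python
import numpy as np

def intrinsic_dimension(attention_row, eps=1e-3):
    """Count keys whose attention weight exceeds eps for one query token.

    attention_row: 1-D array of softmax attention weights (sums to 1).
    Hypothetical reading of ID^l_eps(i); see Section 3.3 of the paper.
    """
    return int(np.sum(attention_row > eps))

# Stand-in attention logits for a single query over 8 keys.
logits = np.array([4.0, 3.0, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1])
weights = np.exp(logits) / np.exp(logits).sum()
print(intrinsic_dimension(weights, eps=0.05))  # two keys dominate this row
```

In the real pipeline these weights would come from a hook on the LlamaAttention module rather than synthetic logits.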
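The protocol in the Dataset Splits and Experiment Setup rows (70/30 split, sklearn defaults, no cross-validation) can be sketched as follows. The random features here are a stand-in for the per-layer LLM features of Eqs. (feature 1) to (feature 7), which would require running Llama2-7B; the shapes and labels are illustrative only:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Stand-in for extracted LLM features (one row per prompt); the real
# pipeline computes these from Llama2-7B activations on 1024-token inputs.
X = rng.normal(size=(1000, 32))
# Synthetic toxic/non-toxic labels correlated with the first feature.
y = (X[:, 0] + 0.5 * rng.normal(size=1000) > 0).astype(int)

# 70% train / 30% held-out evaluation, as described; no validation split.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# Default sklearn hyper-parameters, no cross-validation.
for clf in (LogisticRegression(), RandomForestClassifier(random_state=0)):
    acc = clf.fit(X_tr, y_tr).score(X_te, y_te)
    print(type(clf).__name__, round(acc, 3))
```

Accuracy on these synthetic features is meaningless; the sketch only mirrors the split and model choices stated in the table.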