Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Demystifying Singular Defects in Large Language Models
Authors: Haoqi Wang, Tong Zhang, Mathieu Salzmann
ICML 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically validate these findings on a variety of LLMs, including LLaMA2 (Touvron et al., 2023), Phi3 (Abdin et al., 2024), MPT (Team, 2023), Pythia (Biderman et al., 2023), Vicuna1.5 (Platzer & Puschner, 2021), Falcon2 (Malartic et al., 2024), GPT2 (Radford et al., 2019), Qwen2.5 (Team, 2024), to name a few. |
| Researcher Affiliation | Academia | 1School of Computer and Communication Sciences, EPFL, Switzerland 2University of Chinese Academy of Sciences, China 3Swiss Data Science Center, Switzerland. |
| Pseudocode | No | The paper describes methods using mathematical formulations and textual explanations, but it does not contain any explicitly labeled pseudocode blocks or algorithms. |
| Open Source Code | Yes | Code is released at https://github.com/haoqiwang/singular_defect. |
| Open Datasets | Yes | Taking LLaMA2-7B as an example, we extract the hidden states of 1K random rows from the WikiText2-v1 dataset (Merity et al., 2017) across all layers and compute the norm of each token in each layer. |
| Dataset Splits | Yes | Taking LLaMA2-7B as an example, we extract the hidden states of 1K random rows from the WikiText2-v1 dataset (Merity et al., 2017) across all layers and compute the norm of each token in each layer. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used to conduct its experiments or analysis. |
| Software Dependencies | No | The paper mentions various LLMs and quantization techniques but does not specify the versions of key software libraries (e.g., Python, PyTorch, CUDA) used for its own experimental setup. |
| Experiment Setup | No | The paper describes the analytical methodology and observed phenomena in LLMs. While it discusses aspects like quantization strategies, it does not provide specific hyperparameters (e.g., learning rate, batch size, number of epochs) or detailed system-level training settings for reproducing experiments. |
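The hidden-state norm analysis quoted under "Open Datasets" above can be sketched as follows. This is a minimal illustration with synthetic data standing in for real extracted hidden states; the shapes, the 1K-row WikiText2-v1 sample, and the 10x-median outlier threshold are assumptions for demonstration, not the paper's exact procedure.

```python
import numpy as np

# Synthetic stand-in for extracted hidden states, shaped
# (num_layers, num_tokens, hidden_dim). In practice these would come
# from a forward pass over WikiText2-v1 rows with hidden states enabled
# (e.g. output_hidden_states=True in Hugging Face transformers).
rng = np.random.default_rng(0)
num_layers, num_tokens, hidden_dim = 32, 128, 4096  # LLaMA2-7B-like shapes
hidden_states = rng.standard_normal((num_layers, num_tokens, hidden_dim))

# Per-token L2 norm in each layer, shape (num_layers, num_tokens).
token_norms = np.linalg.norm(hidden_states, axis=-1)

# Tokens whose norm greatly exceeds the layer median are candidate
# high-norm outlier tokens of the kind the paper studies; the factor
# of 10 here is an arbitrary illustrative threshold.
median_per_layer = np.median(token_norms, axis=1, keepdims=True)
outliers = token_norms > 10 * median_per_layer
print(token_norms.shape, int(outliers.sum()))
```

With isotropic Gaussian data no token stands out; on a real LLM, the paper's observation is that a few tokens exhibit dramatically larger norms in certain layers.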