Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
XTC: Extreme Compression for Pre-trained Transformers Made Simple and Efficient
Authors: Xiaoxia Wu, Zhewei Yao, Minjia Zhang, Conglong Li, Yuxiong He
NeurIPS 2022 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We perform a very comprehensive systematic study to measure the impact of many key hyperparameters and training strategies from previous works. |
| Researcher Affiliation | Industry | Microsoft EMAIL |
| Pseudocode | No | The paper describes its methods in prose and figures, but does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks. |
| Open Source Code | Yes | Code is released as a part of https://github.com/microsoft/DeepSpeed |
| Open Datasets | Yes | All these evaluations are performed with the General Language Understanding Evaluation (GLUE) benchmark [51] |
| Dataset Splits | Yes | We report results on the development sets after compressing a pre-trained model (e.g., BERT-base and TinyBERT) using the corresponding single-task training data. |
| Hardware Specification | No | The paper states 'Did you include the total amount of compute and the type of resources used (e.g., type of GPUs, internal cluster, or cloud provider)? [Yes] Those are in main text.' However, specific hardware details like GPU/CPU models or explicit cloud instance types are not found in the main text. |
| Software Dependencies | No | The paper does not provide specific software dependency versions (e.g., Python 3.8, PyTorch 1.9, CUDA 11.1) in the main text. |
| Experiment Setup | Yes | We consider three budgets listed in Table 1, which cover the practical scenarios of short, standard, and long training time... Meanwhile, we also perform a grid search of peak learning rates {2e-5, 1e-4, 5e-4}. For more training details on iterations and batch size per iteration, please see Table C.1. |
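
The Experiment Setup row describes three training budgets and a grid search over peak learning rates. The following is a minimal sketch of how such a grid could be enumerated, not the authors' code. The budget labels are taken from the quoted text, while the function names `build_grid` and `select_best` are hypothetical, and running the search once per budget is an assumption. The actual iteration counts and batch sizes are given in the paper's Table 1 and Table C.1 and are not reproduced here.

```python
from itertools import product

# Labels follow the quoted text; the concrete training lengths live in the
# paper's Table 1 and are intentionally left out of this sketch.
BUDGETS = ["short", "standard", "long"]

# Peak learning rates quoted in the Experiment Setup row.
PEAK_LEARNING_RATES = [2e-5, 1e-4, 5e-4]


def build_grid(budgets, learning_rates):
    """Return one config dict per (budget, peak learning rate) pair.

    Assumption: the learning-rate search is repeated for every budget.
    """
    return [
        {"budget": budget, "peak_lr": lr}
        for budget, lr in product(budgets, learning_rates)
    ]


def select_best(results):
    """Pick the config with the highest score.

    `results` is a list of (config, dev_score) pairs, e.g. a GLUE
    development-set metric obtained by a training run (not shown here).
    """
    best_config, _ = max(results, key=lambda pair: pair[1])
    return best_config


grid = build_grid(BUDGETS, PEAK_LEARNING_RATES)
```

Each entry of `grid` would be handed to whatever training script is in use, and `select_best` then keeps the configuration with the highest development-set score.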