reproducibilityindex.ai

Adversarial Moment-Matching Distillation of Large Language Models

Authors: Chen Jia

NeurIPS 2024

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Results from both task-agnostic instruction-following experiments and task-specific experiments demonstrate the effectiveness of our method and achieve new state-of-the-art performance. Empirically, we evaluate our approach on both the instruction-following dataset and three task-specific datasets for text summarization, machine translation, and commonsense reasoning.
Researcher Affiliation	Industry	Chen Jia SI-TECH Information Technology jiachenwestlake@gmail.com
Pseudocode	Yes	Algorithm 1: Adversarial training procedure (Page 5)
Open Source Code	Yes	The code and implementation are released at https://github.com/jiachenwestlake/MMKD.
Open Datasets	Yes	We construct the training data from databricks-dolly-15k [8], where we randomly select 15K samples for training and equally split 500 samples for validation and testing. We also add the OpenWebText [13] corpus. For the text summarization task, we follow Ko et al. [21] to conduct experiments on the SAMSum [12] dataset. For the machine translation tasks, we follow Ko et al. [21] to conduct experiments on the IWSLT 17 (en-de) [5] dataset. For the commonsense reasoning task, we conduct experiments on the StrategyQA dataset [11].
Dataset Splits	Yes	We construct the training data from databricks-dolly-15k [8], where we randomly select 15K samples for training and equally split 500 samples for validation and testing.
Hardware Specification	Yes	We use NVIDIA A40 GPUs with 40GB RAM to conduct all the experiments. (Appendix B.1)
Software Dependencies	No	The paper does not explicitly list specific software dependencies with version numbers (e.g., Python 3.x, PyTorch 1.x, etc.).
Experiment Setup	Yes	For more details on the experimental setup, refer to Appendix B. (Tables 3 and 4 in Appendix B list detailed hyperparameters such as Max. Step Size, Inner Step Size, Batch Size, Learning Rate, etc.)
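
The random split quoted under Dataset Splits can be sketched as follows. This is a minimal illustration and not the authors' code: the function name `split_dataset`, the seed, and the toy records are hypothetical, and reading "equally split 500 samples for validation and testing" as 500 samples each is an assumption (the quoted sentence could also mean 250 each).

```python
import random


def split_dataset(samples, n_val, n_test, seed=0):
    """Shuffle `samples` reproducibly and carve off validation and test subsets.

    Hypothetical helper for illustration; the held-out sizes are parameters
    because the source text does not state them unambiguously.
    """
    if n_val + n_test > len(samples):
        raise ValueError("held-out sizes exceed dataset size")
    rng = random.Random(seed)      # fixed seed so the split can be reproduced
    shuffled = list(samples)       # copy so the caller's list is not mutated
    rng.shuffle(shuffled)
    val = shuffled[:n_val]
    test = shuffled[n_val:n_val + n_test]
    train = shuffled[n_val + n_test:]
    return train, val, test


# Toy records standing in for the instruction-following data (assumption).
records = [{"id": i} for i in range(2000)]
train, val, test = split_dataset(records, n_val=500, n_test=500, seed=42)
print(len(train), len(val), len(test))
```

Fixing the seed and recording it alongside the split sizes is what makes a split like this reproducible; the page does not report which seed, if any, the paper used.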