MERGE: Fast Private Text Generation

Authors: Zi Liang, Pinghui Wang, Ruofei Zhang, Nuo Xu, Shuo Zhang, Lifeng Xing, Haitao Bai, Ziyang Zhou

AAAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments show that MERGE achieves a 26.5x speedup to the vanilla encrypted model under the sequence length 512, and reduces 80% communication cost, with an up to 10x speedup to state-of-the-art approximated models.
Researcher Affiliation | Academia | MOE KLINNS Lab, Xi'an Jiaotong University, Xi'an 710049, P. R. China. {liangzid, zs412082986, xlf20200926, haitao.bai, dakandao}@stu.xjtu.edu.cn, phwang@mail.xjtu.edu.cn, rfzhang@gmail.com, nxu@sei.xjtu.edu.cn
Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | Source code of experiments can be found here: https://github.com/liangzid/MERGE.
Open Datasets | Yes | We evaluate MERGE on three representative text generation tasks, including MultiWOZ (Eric et al. 2020), a human-human multi-turn task-oriented dialogue corpus, DailyDialog (Li et al. 2017), a multi-turn chitchat dataset, and CommonGen (Lin et al. 2020), a hard-constrained controlled text generation benchmark. (Loading sketch after this table.)
Dataset Splits | No | The paper mentions training parameters and datasets, but does not explicitly provide specific training/validation/test dataset splits (e.g., percentages or sample counts).
Hardware Specification | Yes | All experiments above are on a single 32 GB Nvidia Tesla V100 GPU. Following previous works (Li et al. 2022), for the experiments of private inference, we use two 32 GB Nvidia Tesla V100 GPUs to simulate the client and the server, with 10 GbE Ethernet bandwidth. (Two-party simulation sketch after this table.)
Software Dependencies | No | The paper mentions 'Hugging Face Transformers', 'CrypTen', and 'PyTorch' but does not provide specific version numbers for these software dependencies.
Experiment Setup | Yes | We trained all models under the learning rate 3 × 10^−5, batch size 4 with 3 epochs... train MERGE with 50,000 steps under the learning rate 8 × 10^−5. We set the dropout rate to 0.6, λ to 0.75, and noise to 0.75. (Configuration sketch after this table.)
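The "Open Datasets" row lists MultiWOZ, DailyDialog, and CommonGen. Below is a minimal loading sketch; the Hugging Face Hub identifiers are assumptions based on the commonly used names for these corpora and are not taken from the paper or its repository.

```python
# Minimal sketch of loading the three evaluation corpora from the Hugging Face Hub.
# The dataset identifiers are assumed Hub names, not paths from the MERGE repository;
# newer versions of `datasets` may additionally require trust_remote_code=True.
from datasets import load_dataset

multiwoz = load_dataset("multi_woz_v22")    # MultiWOZ: multi-turn task-oriented dialogue
dailydialog = load_dataset("daily_dialog")  # DailyDialog: multi-turn chitchat
commongen = load_dataset("common_gen")      # CommonGen: constrained generation benchmark

for name, ds in [("MultiWOZ", multiwoz), ("DailyDialog", dailydialog), ("CommonGen", commongen)]:
    print(name, {split: len(ds[split]) for split in ds})
```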
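The "Hardware Specification" row describes a two-GPU client/server simulation for private inference, and the paper's implementation builds on CrypTen. The sketch below is a generic two-party CrypTen example (a toy encrypted linear layer, not the authors' benchmarking harness) illustrating how such a simulation is typically launched; the tensor shapes and the single-machine launcher are placeholders, whereas the paper ran the two parties on separate V100 GPUs over 10 GbE Ethernet.

```python
# Generic two-party secure computation sketch with CrypTen (not the paper's harness).
# Party 0 plays the client holding the input; party 1 plays the server holding a weight.
import torch
import crypten
import crypten.mpc as mpc
import crypten.communicator as comm

crypten.init()

@mpc.run_multiprocess(world_size=2)  # spawns two parties on one machine for simulation
def private_linear_layer():
    x_enc = crypten.cryptensor(torch.randn(1, 16), src=0)   # secret-share the client input
    w_enc = crypten.cryptensor(torch.randn(16, 16), src=1)  # secret-share the server weight
    y_enc = x_enc.matmul(w_enc).relu()                       # arithmetic + comparison under MPC
    y = y_enc.get_plain_text()                               # reveal the result
    if comm.get().get_rank() == 0:
        print("decrypted output shape:", tuple(y.shape))

private_linear_layer()
```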
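The "Experiment Setup" row reports concrete hyper-parameters. The sketch below maps them onto standard Hugging Face TrainingArguments purely for illustration; the two-stage split, the output directories, and the MERGE-specific options dict (dropout rate, λ, noise) are hypothetical placeholders showing where the reported values would go, not the authors' actual training script.

```python
# Sketch of the reported hyper-parameters expressed as Hugging Face TrainingArguments.
# The stage layout and the MERGE-specific options dict are placeholders, not the
# authors' script; see https://github.com/liangzid/MERGE for the real implementation.
from transformers import TrainingArguments

# Stage 1: task fine-tuning (reported: learning rate 3e-5, batch size 4, 3 epochs).
finetune_args = TrainingArguments(
    output_dir="ckpt-finetune",
    learning_rate=3e-5,
    per_device_train_batch_size=4,
    num_train_epochs=3,
)

# Stage 2: MERGE training (reported: 50,000 steps at learning rate 8e-5).
merge_args = TrainingArguments(
    output_dir="ckpt-merge",
    learning_rate=8e-5,
    per_device_train_batch_size=4,
    max_steps=50_000,
)

# Reported MERGE-specific settings; the keys are hypothetical names for illustration.
merge_options = {"dropout_rate": 0.6, "lambda": 0.75, "noise": 0.75}
```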