DP-OPT: Make Large Language Model Your Privacy-Preserving Prompt Engineer
Authors: Junyuan Hong, Jiachen T. Wang, Chenhui Zhang, Zhangheng LI, Bo Li, Zhangyang Wang
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirically, our method presents an outstanding performance on multiple language tasks. Prompts tuned on open-source Vicuna-7b (Chiang et al., 2023) can achieve significant performance gains across 4 tasks after transfer to closed-source heterogeneous-architecture models (GPT3.5) or open-source models (Llama-2 (Touvron et al., 2023) or Vicuna-33b). |
| Researcher Affiliation | Academia | 1University of Texas at Austin, 2Princeton University, 3MIT, 4University of Chicago |
| Pseudocode | Yes | Algorithm 1 DP-OPT (ϵ0 < ∞) or OPT (ϵ0 = ∞), Algorithm 2 DP-Ens Gen: Differentially-Private Ensemble Generation, Algorithm 3 Deep Language Network (DLN-1) and potential privacy leakage, Algorithm 4 LimitedDomain(h; k̄, k, ϵ0, δ0). (An illustrative sketch of a private token-selection step is given below the table.) |
| Open Source Code | Yes | Codes are available at https://github.com/VITA-Group/DP-OPT. |
| Open Datasets | Yes | We use SST-2 from the GLUE benchmark (Wang et al., 2018) which includes 6.7 × 10^4 samples. Trec and Mpqa (Lu et al., 2021) and Disaster (Bansal et al., 2019) are smaller datasets consisting of fewer training samples. |
| Dataset Splits | Yes | The validation set is selected from the training set per random seed. The ratio of validation with respect to the training set is included in brackets. For all trainable methods, we hold out 5% of training data for validation and report accuracy on the original test set. (A minimal split sketch is given below the table.) |
| Hardware Specification | No | Portions of this research were conducted with the advanced computing resources provided by Texas A&M High Performance Research Computing, a composable computing cluster (He et al., 2023). This statement refers to a high-performance computing resource but does not provide specific hardware models like GPUs or CPUs. |
| Software Dependencies | No | For DPSGD, we adopt dp-transformers package to reduce the memory overhead caused by gradient clipping (Wutschitz et al., 2022) and tune the hyper-parameters for each dataset. The version number for `dp-transformers` is not specified. (A snippet for recording the installed versions is given below the table.) |
| Experiment Setup | Yes | Detailed parameters for DP-OPT are given in Table 5. As the total δ is determined by sample size, we mainly tune the ϵ0 and δ0 for each dataset such that enough tokens can be generated in DP-OPT. For DP-OPT and OPT on Mpqa, we set the repetition penalty to be 1.1 to avoid repeated words. Table 5 lists Max new tokens, Batch size, ϵ0, δ0, and temperature for each task. |
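
The "Pseudocode" row lists Algorithm 2 (DP-Ens Gen), which privately aggregates next-token proposals from an ensemble of prompts. The sketch below is not the authors' implementation: it illustrates the general shape of a differentially private token-selection step using the plain exponential mechanism over ensemble vote counts (the paper's LimitedDomain mechanism additionally restricts selection to a top-k̄ candidate set, which is not shown here), and all names are hypothetical.

```python
import numpy as np

def dp_select_token(vote_counts: dict[str, int], epsilon0: float, rng=None) -> str:
    """Select one candidate token via the exponential mechanism.

    vote_counts: how many ensemble members proposed each candidate token.
    epsilon0:    per-step privacy budget; each member changes a count by at
                 most 1, so the utility (vote count) has sensitivity 1.
    """
    rng = rng or np.random.default_rng()
    tokens = list(vote_counts)
    scores = np.array([vote_counts[t] for t in tokens], dtype=float)
    # Exponential mechanism: P(token) proportional to exp(eps * score / 2).
    logits = epsilon0 * scores / 2.0
    logits -= logits.max()            # subtract max for numerical stability
    probs = np.exp(logits)
    probs /= probs.sum()
    return tokens[rng.choice(len(tokens), p=probs)]

# Example: three hypothetical candidates proposed by an ensemble of 10 prompts.
print(dp_select_token({"Classify": 6, "Decide": 3, "Label": 1}, epsilon0=1.0))
```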
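For the "Dataset Splits" row, the 5% per-seed hold-out can be reproduced with a standard split. A minimal sketch assuming the Hugging Face `datasets` loader and GLUE's SST-2 follows; since GLUE's SST-2 test labels are hidden, the public validation split stands in as the test set here, which is an assumption rather than the authors' stated protocol.

```python
from datasets import load_dataset

SEED = 0             # the paper re-draws the validation set per random seed
VAL_FRACTION = 0.05  # 5% of the training data is held out for validation

sst2 = load_dataset("glue", "sst2")  # SST-2 from the GLUE benchmark
split = sst2["train"].train_test_split(test_size=VAL_FRACTION, seed=SEED)
train_set, val_set = split["train"], split["test"]
test_set = sst2["validation"]        # labeled stand-in for the test set (assumption)

print(len(train_set), len(val_set), len(test_set))
```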
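Because the `dp-transformers` version is not pinned in the paper, a reproduction should record the installed versions explicitly; one simple way to do so:

```python
from importlib.metadata import version, PackageNotFoundError

# Log the exact versions used in a run, since the paper does not pin them.
for pkg in ("dp-transformers", "transformers", "torch"):
    try:
        print(f"{pkg}=={version(pkg)}")
    except PackageNotFoundError:
        print(f"{pkg} is not installed")
```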
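Finally, the per-task settings referenced in the "Experiment Setup" row (Table 5 of the paper) can be captured in a small configuration structure. The values below are placeholders, not the paper's actual numbers; only the repetition penalty of 1.1 for Mpqa is taken from the text.

```python
from dataclasses import dataclass

@dataclass
class DPOPTConfig:
    max_new_tokens: int       # cap on the length of the generated prompt
    batch_size: int
    epsilon0: float           # privacy budget parameter for private generation
    delta0: float             # (total delta is determined by the sample size)
    temperature: float
    repetition_penalty: float = 1.0

# Placeholder values for illustration only; see Table 5 of the paper for the real settings.
CONFIGS = {
    "sst2": DPOPTConfig(max_new_tokens=50, batch_size=32, epsilon0=2.0,
                        delta0=1e-6, temperature=1.0),
    "mpqa": DPOPTConfig(max_new_tokens=50, batch_size=32, epsilon0=2.0,
                        delta0=1e-6, temperature=1.0,
                        repetition_penalty=1.1),  # 1.1 is stated in the text to avoid repeated words
}
```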