Privacy-Preserving In-Context Learning with Differentially Private Few-Shot Generation

Authors: Xinyu Tang, Richard Shin, Huseyin A Inan, Andre Manoel, Fatemehsadat Mireshghallah, Zinan Lin, Sivakanth Gopi, Janardhan Kulkarni, Robert Sim

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct extensive experiments on standard benchmarks and compare our algorithm with non-private ICL and zero-shot solutions. Our results demonstrate that our algorithm can achieve competitive performance with strong privacy levels.
Researcher Affiliation | Collaboration | (1) Princeton University; (2) Microsoft Semantic Machines; (3) M365 Research; (4) University of Washington; (5) Microsoft Research
Pseudocode | Yes | Alg. 1 presents the pseudocode of the proposed algorithm for the first step, and Fig. 3 provides a demonstration with an example.
Open Source Code | Yes | Our repository is located at https://github.com/microsoft/dp-few-shot-generation.
Open Datasets | Yes | Datasets. We study the 4-way news classification AGNews (Zhang et al., 2015), 6-way question classification TREC (Voorhees & Tice, 2000), and 14-way topic classification DBPedia (Zhang et al., 2015) datasets for classification tasks. For information extraction tasks, we study the slot-filling dataset MIT Movies trivia10k13 (Liu et al., 2012).
Dataset Splits | No | The paper specifies training and test sample counts for each dataset in Appendix C (e.g., '30,000 training and 1,900 test samples per class' for AGNews), but it does not describe a separate validation split with counts or percentages.
Hardware Specification | No | The paper states, 'We present our main results on GPT-3 Babbage using the Open AI API.' The experiments rely on OpenAI's infrastructure, and no specific hardware details (e.g., GPU models, CPU, memory) are provided.
Software Dependencies | No | The paper mentions using the 'Open AI API' but does not list specific software dependencies such as programming languages, libraries, or frameworks with version numbers (e.g., Python, PyTorch, or TensorFlow versions).
Experiment Setup | Yes | We provide the parameters that we use for our main results in Tab. 9 and the hyperparameter search in Tab. 10-14, respectively, in Appendix E.