Delving into Differentially Private Transformer
Authors: Youlong Ding, Xueyang Wu, Yining Meng, Yonggang Luo, Hao Wang, Weike Pan
ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically evaluate our Re-Attention Mechanism on two public recommendation datasets collected from real-world scenarios: MovieLens (Harper & Konstan, 2015) and Amazon (McAuley et al., 2015). The task is to predict the next item given a sequence of items as input, i.e., the most canonical use case of Transformer models. Figure 4 shows the model accuracy every five epochs during training. Table 2 and Table 3 show the best NDCG@10 and HIT@10 for all the methods on MovieLens and Amazon. |
| Researcher Affiliation | Collaboration | Youlong Ding¹˒², Xueyang Wu³, Yining Meng⁴, Yonggang Luo⁴, Hao Wang⁵, Weike Pan¹. ¹College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, China; ²The Hebrew University of Jerusalem, Jerusalem, Israel; ³Hong Kong University of Science and Technology, Hong Kong SAR, China; ⁴Changan Automobile and Changan Technology Co., Ltd, Chongqing, China; ⁵Rutgers University, New Jersey, USA. |
| Pseudocode | Yes | Algorithm 1 (Phantom Clipping). Parameters: batch size $B$, sentence length $L$, vocabulary size $M$, candidate size $\tilde{M} = M$, embedding dimension $d$. Input: $a_s \in \mathbb{R}^{B \times L \times M}$, $a_c \in \mathbb{R}^{M \times M}$, $e_s \in \mathbb{R}^{B \times L \times d}$, $e_c \in \mathbb{R}^{B \times M \times d}$. Notation: for $X \in \mathbb{R}^{a \times b \times c}$ and $Y \in \mathbb{R}^{a \times c \times d}$, define $X \odot Y \in \mathbb{R}^{a \times b \times d}$ by $(X \odot Y)_i = X_i Y_i \in \mathbb{R}^{b \times d}$ for $i \in [a]$ (a code sketch of this batched product appears after the table). |
| Open Source Code | No | The paper states 'We implement our Phantom Clipping based on AWS's fast DP library, which has implemented Ghost Clipping.' and provides a link to this third-party library (https://github.com/awslabs/fast-differential-privacy). However, it does not explicitly state that the code for their specific methods (Re-Attention Mechanism and Phantom Clipping) is open-source, nor does it provide a link to their own repository. |
| Open Datasets | Yes | We conduct experiments on two public recommendation datasets collected from real-world scenarios: MovieLens (Harper & Konstan, 2015) and Amazon (McAuley et al., 2015). |
| Dataset Splits | No | The paper states 'For data partitioning, the last token of each sequence is left for testing.' but does not give explicit training, validation, and test splits with percentages or counts, nor does it specify whether a validation set was created or how it was used for hyperparameter tuning (a hedged reconstruction of this leave-last-out protocol appears after the table). |
| Hardware Specification | Yes | Figure 8a shows the maximum batch size that can fit into a Tesla V100 GPU (16 GB of VRAM). ... Figure 8b shows the training speed on a single Tesla V100 GPU. |
| Software Dependencies | No | The paper mentions 'PyTorch' and 'AWS's fast DP library' but does not provide specific version numbers for these or other software components used in the experiments, which is necessary for a reproducible description of ancillary software. |
| Experiment Setup | Yes | The number of epochs is set to 100. The batch size is chosen from $\{256, 512, 1024, 2048, 4096\}$. The learning rate is chosen from $\{10^{-3}, 3 \times 10^{-3}, 5 \times 10^{-3}, 7 \times 10^{-3}, 9 \times 10^{-3}\}$. The dropout rate is 0.2 for MovieLens and 0.5 for Amazon (due to its high sparsity). We use the Adam optimizer with a weight decay of $10^{-5}$ (a configuration sketch appears after the table). |
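The batched-product notation $\odot$ in the Phantom Clipping pseudocode maps directly onto a batched matrix multiply, and the algorithm's inputs are the embedding-layer activations and output gradients that the ghost-norm trick consumes. The PyTorch sketch below illustrates the $\odot$ operator and the standard ghost-clipping identity that Phantom Clipping builds on; the helper names are ours, and this is not the paper's fused Algorithm 1, which additionally handles the embedding table being shared between the input sequence and the candidate items.

```python
import torch

def batched_matmul(X: torch.Tensor, Y: torch.Tensor) -> torch.Tensor:
    """The pseudocode's X ⊙ Y: for X in R^{a×b×c} and Y in R^{a×c×d},
    (X ⊙ Y)_i = X_i @ Y_i, yielding a tensor in R^{a×b×d}."""
    return torch.bmm(X, Y)  # equivalently: torch.einsum('abc,acd->abd', X, Y)

def ghost_norm_squared(a: torch.Tensor, g: torch.Tensor) -> torch.Tensor:
    """Hypothetical helper (our naming): the classic ghost-clipping identity for
    a linear/embedding layer. With per-example input a_i in R^{L×M} and output
    gradient g_i in R^{L×d}, the per-example weight-gradient norm satisfies
        ||a_i^T g_i||_F^2 = <a_i a_i^T, g_i g_i^T>_F,
    so it can be computed from two (B, L, L) tensors without materializing the
    (B, M, d) per-example gradients."""
    aaT = batched_matmul(a, a.transpose(1, 2))  # (B, L, L)
    ggT = batched_matmul(g, g.transpose(1, 2))  # (B, L, L)
    return (aaT * ggT).sum(dim=(1, 2))          # (B,) squared norms

# Toy usage with the dimensions from the pseudocode (values are illustrative).
B, L, M, d = 4, 16, 100, 32
a_s = torch.randn(B, L, M)  # one-hot activations in the paper; random here
e_s = torch.randn(B, L, d)  # gradients w.r.t. the embedding-layer output
per_example_sq_norms = ghost_norm_squared(a_s, e_s)
```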
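For the data partitioning quoted in the table, 'the last token of each sequence is left for testing' describes a standard leave-last-out protocol for sequential recommendation. The following minimal sketch is our reconstruction of that reading; since the paper does not mention a validation set, none is carved out here.

```python
def leave_last_out(user_sequences):
    """Hedged reconstruction of 'the last token of each sequence is left for
    testing'. The paper does not describe a validation split, so none is made."""
    train, test = {}, {}
    for user, seq in user_sequences.items():
        if len(seq) < 2:        # nothing to hold out; keep the sequence for training
            train[user] = seq
            continue
        train[user] = seq[:-1]  # all interactions except the last
        test[user] = seq[-1]    # last interaction reserved for testing
    return train, test

# Example: user 0 trains on [3, 7, 9] and is tested on item 2.
train, test = leave_last_out({0: [3, 7, 9, 2], 1: [5]})
```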
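The reported experiment setup amounts to a small grid search over batch size and learning rate with fixed epochs, dropout, and optimizer settings. The sketch below records those reported values as a PyTorch configuration; the grid-search scaffolding and the model/training loop are our assumptions, since the paper does not state how the best configuration was selected.

```python
import itertools
import torch

# Values reported in the paper's experiment setup.
EPOCHS = 100
BATCH_SIZES = [256, 512, 1024, 2048, 4096]
LEARNING_RATES = [1e-3, 3e-3, 5e-3, 7e-3, 9e-3]
WEIGHT_DECAY = 1e-5
DROPOUT = {"MovieLens": 0.2, "Amazon": 0.5}  # higher dropout for the sparser Amazon data

def make_optimizer(model: torch.nn.Module, lr: float) -> torch.optim.Adam:
    """Adam with the paper's reported weight decay of 1e-5."""
    return torch.optim.Adam(model.parameters(), lr=lr, weight_decay=WEIGHT_DECAY)

# Grid-search scaffolding (our assumption; the selection criterion is unstated).
for batch_size, lr in itertools.product(BATCH_SIZES, LEARNING_RATES):
    pass  # build the model, call make_optimizer(model, lr), train for EPOCHS epochs
```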