Tempered Sigmoid Activations for Deep Learning with Differential Privacy
Authors: Nicolas Papernot, Abhradeep Thakurta, Shuang Song, Steve Chien, Úlfar Erlingsson
AAAI 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To demonstrate empirically the superior performance of tempered sigmoids, we show how using tempered sigmoids instead of ReLU activations significantly improves a model's private-learning suitability and achievable privacy/accuracy tradeoffs. We advance the state-of-the-art of deep learning with differential privacy for MNIST, Fashion MNIST, and CIFAR10. On these datasets, for fixed privacy guarantees ε < 3, we achieve 98.1% test accuracy (instead of 96.1%) on MNIST, 86.1% (instead of 81.9%) on Fashion MNIST, and 66.2% (instead of 61.6%) on CIFAR10. |
| Researcher Affiliation | Industry | Google Brain, Apple. {papernot, athakurta, shuangsong, schien}@google.com, ulfar@apple.com |
| Pseudocode | Yes | from jax.scipy.special import expit def temp_sigmoid(x, scale=2., inverse_temp=2., offset=1., axis=-1): return scale * expit(inverse_temp * x) - offset def elementwise(fun, **fun_kwargs): """Layer that applies a scalar function elementwise.""" init_fun = lambda rng, input_shape: (input_shape, ()) apply_fun = lambda params, inp, **kwargs: fun(inp, **fun_kwargs) return init_fun, apply_fun TemperedSigmoid = elementwise(temp_sigmoid, axis=-1) (a runnable sketch of this snippet appears after this table) |
| Open Source Code | Yes | Our code will be open-sourced through a pull request to the JAX repository on GitHub, and we include the code snippet for the tempered sigmoid activation below to demonstrate the practicality of implementing the change we propose in neural architectures. |
| Open Datasets | Yes | We use three common benchmarks for differentially private ML: MNIST (LeCun, Cortes, and Burges 1998), Fashion MNIST (Xiao, Rasul, and Vollgraf 2017), and CIFAR10 (Krizhevsky, Hinton et al. 2009). |
| Dataset Splits | No | The paper mentions using MNIST, Fashion MNIST, and CIFAR10, which are standard benchmarks, but does not explicitly state the specific training, validation, or test split percentages or sample counts used for these datasets within the text. |
| Hardware Specification | Yes | All of our experiments are performed with the JAX framework in Python, on a machine equipped with a 5th generation Intel Xeon processor and NVIDIA V100 GPU acceleration. |
| Software Dependencies | No | The paper mentions 'JAX framework in Python' and 'TensorFlow Privacy library' but does not specify version numbers for these software components. |
| Experiment Setup | Yes | For both MNIST and Fashion MNIST, we use a convolutional neural network whose architecture is described in Table 1. For CIFAR10, we use the deeper model in Table 2. |
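
The code in the Pseudocode row above is flattened by the table layout, so here is a minimal, runnable sketch of the same tempered sigmoid activation. It assumes a recent JAX release where the stax mini-library is available as jax.example_libraries.stax (older versions used jax.experimental.stax). The small Dense/LogSoftmax network it is plugged into is purely illustrative and is not the paper's Table 1 or Table 2 architecture; the default parameters scale=2, inverse_temp=2, offset=1 are the ones shown in the paper's snippet and reduce the activation to tanh(x).

```python
# Hedged sketch: tempered sigmoid as a parameter-free stax activation layer.
# Assumes stax is importable from jax.example_libraries (recent JAX versions).
import jax.numpy as jnp
from jax import random
from jax.scipy.special import expit
from jax.example_libraries import stax


def temp_sigmoid(x, scale=2., inverse_temp=2., offset=1.):
    """Tempered sigmoid: scale * sigmoid(inverse_temp * x) - offset.

    With the defaults (scale=2, inverse_temp=2, offset=1) this equals tanh(x).
    """
    return scale * expit(inverse_temp * x) - offset


# stax.elementwise wraps a scalar function as a layer with no trainable parameters.
TemperedSigmoid = stax.elementwise(temp_sigmoid, scale=2., inverse_temp=2., offset=1.)

# Illustrative wiring only; the paper's actual CNN architectures are in its Tables 1 and 2.
init_fun, apply_fun = stax.serial(
    stax.Dense(256), TemperedSigmoid,
    stax.Dense(10), stax.LogSoftmax,
)

rng = random.PRNGKey(0)
output_shape, params = init_fun(rng, (-1, 784))
logits = apply_fun(params, jnp.ones((4, 784)))
print(output_shape, logits.shape)  # (-1, 10) (4, 10)
```

Because the activation carries no parameters, swapping it in for ReLU leaves the rest of a stax model definition, and any DP-SGD training loop wrapped around it, unchanged.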