SALSA: Attacking Lattice Cryptography with Transformers
Authors: Emily Wenger, Mingjie Chen, Francois Charton, Kristin E. Lauter
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this work, we train transformers to perform modular arithmetic and mix half-trained models with statistical cryptanalysis techniques to propose SALSA: a machine learning attack on LWE-based cryptographic schemes. SALSA can fully recover secrets for small-to-mid size LWE instances with sparse binary secrets, and may scale to attack real-world LWE-based cryptosystems. Our paper has three main contributions. We demonstrate that transformers can perform modular arithmetic on integers and vectors. We show that transformers trained on LWE samples can be used to distinguish LWE instances from random. This can be further turned into two algorithms that recover binary secrets. We show how these techniques yield a practical attack on LWE based cryptosystems and demonstrate its efficacy in the cryptanalysis of small and mid-size LWE instances with sparse binary secrets. Our code is available at https://github.com/facebookresearch/SALSA. (Section 5, SALSA Evaluation:) In this section, we present our experiments with SALSA. We generate datasets for LWE problems of different sizes, defined by the dimension and the density of ones in the binary secret. We use gated universal transformers, with two layers in the encoder and decoder. Default dimensions and attention heads in the encoder and decoder are 1024/512 and 16/4, but we vary them as we scale the problems. Models are trained on two NVIDIA Volta 32GB GPUs on an internal cluster. (See the data-generation sketch below the table.) |
| Researcher Affiliation | Collaboration | Emily Wenger (University of Chicago); Mingjie Chen (University of Birmingham); Francois Charton (Meta AI); Kristin Lauter (Meta AI) |
| Pseudocode | Yes | Algorithm 1 Direct Secret Recovery; Algorithm 2 Distinguisher Secret Recovery |
| Open Source Code | Yes | Our code is available at https://github.com/facebookresearch/SALSA. |
| Open Datasets | No | The paper states: "We generate training data by fixing the modulus q...", indicating they generated their own dataset rather than using a pre-existing public one. No specific access information (link, DOI, formal citation) for this generated dataset is provided. |
| Dataset Splits | No | The paper mentions a "test set" and "training data" but does not specify explicit percentages or sample counts for training, validation, and test splits needed to reproduce the experiment (e.g., "80% training, 10% validation, 10% test"). |
| Hardware Specification | Yes | Models are trained on two NVIDIA Volta 32GB GPUs on an internal cluster. |
| Software Dependencies | No | The paper mentions software components like "Adam optimizer" and "transformers" but does not specify their version numbers (e.g., PyTorch 1.9, TensorFlow 2.x, Python 3.x). |
| Experiment Setup | Yes | The model is trained to predict b from a, for an unknown but fixed value of s. We use sequence-to-sequence transformers [67] with one layer in the encoder and decoder, 512 dimensions and 8 attention heads. We minimize a cross-entropy loss, and use the Adam optimizer [39] with a learning rate of 5 × 10⁻⁵. At epoch end (300,000 examples), model accuracy is evaluated over a test set of 10,000 examples. We train until test accuracy is 95% or loss plateaus for 60 epochs. Our base model has two encoder layers, with 1024 dimensions and 32 attention heads, the second layer iterated 2 times, and two decoder layers with 512 dimensions and 8 heads, the second layer iterated 8 times. ... Models are trained using the Adam optimizer with lr = 10⁻⁵ and 8000 warmup steps. (See the training-setup sketch below the table.) |
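
As the Research Type and Open Datasets rows note, the training data is generated rather than downloaded: fix a modulus q, draw a sparse binary secret s, and emit LWE pairs (a, b). Below is a minimal sketch of that generation step; the specific values of n, q, the error width, and the density are illustrative placeholders, since the extract does not report them.

```python
import numpy as np

def generate_lwe_samples(n=50, q=251, density=0.1, sigma=3.0,
                         num_samples=300_000, seed=0):
    """Illustrative LWE sample generation: b = a . s + e (mod q) with a
    sparse binary secret s. All parameter values here are placeholders,
    not the paper's exact settings."""
    rng = np.random.default_rng(seed)
    # Sparse binary secret: roughly a `density` fraction of coordinates are 1.
    s = (rng.random(n) < density).astype(np.int64)
    # Uniformly random vectors a in Z_q^n.
    a = rng.integers(0, q, size=(num_samples, n))
    # Small error term; the width sigma is an assumption for illustration.
    e = np.rint(rng.normal(0.0, sigma, size=num_samples)).astype(np.int64)
    b = (a @ s + e) % q
    return a, b, s
```

The transformer is then trained to predict b from a; recovering s from a partially trained model is the job of Algorithms 1 and 2 listed in the Pseudocode row.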
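The Experiment Setup row reports the optimizer recipe (Adam, lr = 10⁻⁵, 8000 warmup steps, cross-entropy loss) but, as the Software Dependencies row notes, no framework versions, and it does not state the scheduler shape. A minimal PyTorch sketch of that configuration, assuming a linear warmup and a generic encoder-decoder model, is:

```python
import torch

def build_training_objects(model, lr=1e-5, warmup_steps=8000):
    """Optimizer/loss setup matching the reported recipe: Adam with
    lr = 1e-5, 8000 warmup steps, and a cross-entropy loss. The linear
    warmup shape is an assumption; the paper only states the step count."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    scheduler = torch.optim.lr_scheduler.LambdaLR(
        optimizer, lr_lambda=lambda step: min(1.0, (step + 1) / warmup_steps)
    )
    loss_fn = torch.nn.CrossEntropyLoss()  # token-level loss over output digits
    return optimizer, scheduler, loss_fn

# Reported stopping rule: evaluate on a 10,000-sample test set every
# 300,000 training examples ("epoch"); stop once test accuracy reaches 95%
# or the loss has plateaued for 60 epochs.
```

The smaller modular-arithmetic experiments use the same loss and optimizer with a learning rate of 5 × 10⁻⁵ instead.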