Meta-Learning Neural Bloom Filters
Authors: Jack Rae, Sergey Bartunov, Timothy Lillicrap
ICML 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments explore scenarios where set membership can be learned in one shot with improved compression over the classical Bloom Filter. We compared the Neural Bloom Filter with three memory-augmented neural networks, the LSTM, DNC, and Memory Network, which are all able to write storage sets in one shot. We compared the space (in bits) of the model's memory (or state) to a Bloom Filter at a given false positive rate and 0% false negative rate. The false positive rate is measured empirically over a sample of 50,000 queries for the learned models. (A classical Bloom filter sizing sketch follows the table.) |
| Researcher Affiliation | Collaboration | Jack W. Rae (1,2), Sergey Bartunov (1), Timothy P. Lillicrap (1,2); (1) DeepMind, London, UK; (2) CoMPLEX, Computer Science, University College London, London, UK. Correspondence to: Jack W. Rae <jwrae@google.com>. |
| Pseudocode | Yes | Algorithm 1 (Neural Bloom Filter); Algorithm 2 (Meta-Learning Training). (A hedged memory-write sketch follows the table.) |
| Open Source Code | No | No explicit statement found about releasing the source code for the work described in this paper, nor a direct link to a source-code repository. |
| Open Datasets | Yes | Sampling Strategies on MNIST (Section 5.2); We chose the 2.5M unique tokens in the Gigaword v5 news corpus to be our universe (Section 5.4). |
| Dataset Splits | No | No specific dataset split information (exact percentages, sample counts, citations to predefined splits, or detailed splitting methodology) is provided for validation data. The paper describes a meta-learning training scheme which involves sampling tasks and sets, but does not detail a fixed train/validation/test split for a single dataset. |
| Hardware Specification | Yes | We benchmark the models on the CPU (Intel(R) Xeon(R) CPU E5-1650 v2 @ 3.50GHz) and on the GPU (NVIDIA Quadro P6000), with models implemented in TensorFlow without any model-specific optimizations. |
| Software Dependencies | No | The paper mentions TensorFlow but does not specify a version number or other versioned libraries, which is required for reproducibility. |
| Experiment Setup | Yes | To give an example network configuration, we chose f_enc to be a 3-layer CNN in the case of image inputs, and a 128-hidden-unit LSTM in the case of text inputs. We chose f_w and f_q to be an MLP with a single hidden layer of size 128, followed by layer normalization, and f_out to be a 3-layer MLP with residual connections. We used a leaky ReLU as the non-linearity. For each model we sweep over hyper-parameters relating to model size to obtain their smallest operating size at the desired false positive rate (for the full set, see Appendix D). (A configuration sketch follows the table.) |
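
As a point of reference for the Research Type row, the sketch below computes the classical Bloom filter baseline size and an empirical false-positive estimate of the kind quoted above. The sizing formula is the standard Bloom filter bound; the function names and the 1,000-item example values are ours, not numbers from the paper.

```python
import math
import random

def bloom_filter_bits(n_items: int, fp_rate: float) -> int:
    """Classical Bloom filter size (in bits) for n items at a target false-positive rate."""
    return math.ceil(-n_items * math.log(fp_rate) / (math.log(2) ** 2))

def empirical_fp_rate(query_fn, negatives, n_queries=50_000):
    """Estimate a learned membership model's false-positive rate by querying
    items known to be outside the stored set (50,000 queries, as in the row above)."""
    sample = random.choices(negatives, k=n_queries)
    false_positives = sum(1 for x in sample if query_fn(x))
    return false_positives / n_queries

# Example: space a classical Bloom filter needs for 1,000 items at a 1% false-positive
# rate -- the baseline against which the learned models' memory size is compared.
print(bloom_filter_bits(1_000, 0.01))  # -> 9586 bits
```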
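
For the Pseudocode row, the following is a minimal, framework-free sketch of a one-shot additive memory write with soft addressing, the general mechanism that Algorithm 1 builds on. The addressing scheme, matrix names (A, W_w, W_q), and dimensions here are placeholders of our own and do not reproduce the paper's exact Algorithm 1.

```python
import numpy as np

# Minimal sketch of a one-shot, additive external memory in the spirit of
# Algorithm 1 (Neural Bloom Filter). The encoders and addressing below are
# stand-ins, not the paper's parameterization.
rng = np.random.default_rng(0)
d, slots, word = 16, 8, 4             # embedding dim, memory rows, word size (assumed)
A = rng.standard_normal((slots, d))   # addresses (random placeholders here)
W_w = rng.standard_normal((word, d))  # write-word projection (assumed)
W_q = rng.standard_normal((d, d))     # query projection (assumed)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def write(M, z):
    """One-shot write: add an outer product of a write word and a soft address."""
    a = softmax(A @ (W_q @ z))        # soft addressing over memory slots
    w = W_w @ z                       # write word
    return M + np.outer(a, w)         # additive, order-independent update

def read(M, z):
    """Query: retrieve the addressed memory content for a downstream classifier."""
    a = softmax(A @ (W_q @ z))
    return a @ M

M = np.zeros((slots, word))
for z in rng.standard_normal((5, d)):  # store a 5-element set, one shot per element
    M = write(M, z)
print(read(M, rng.standard_normal(d)).shape)  # (4,)
```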
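
For the Experiment Setup row, here is a minimal Keras sketch of the quoted example configuration. Layer widths the row does not state (CNN filter counts, kernel sizes, output dimensions) and the exact placement of layer normalization are assumptions.

```python
import tensorflow as tf
from tensorflow.keras import layers

act = tf.nn.leaky_relu  # leaky ReLU non-linearity, as quoted above

def make_f_enc_image():
    # 3-layer CNN encoder for image inputs (filter counts and kernel sizes assumed).
    return tf.keras.Sequential([
        layers.Conv2D(32, 3, activation=act),
        layers.Conv2D(32, 3, activation=act),
        layers.Conv2D(32, 3, activation=act),
        layers.Flatten(),
    ])

def make_f_enc_text():
    # 128-hidden-unit LSTM encoder for text inputs.
    return layers.LSTM(128)

def make_f_w_or_f_q(out_dim=64):
    # MLP with a single hidden layer of size 128, followed by layer
    # normalization (output width and LN placement assumed).
    return tf.keras.Sequential([
        layers.Dense(128, activation=act),
        layers.Dense(out_dim),
        layers.LayerNormalization(),
    ])

class FOut(tf.keras.Model):
    """3-layer MLP with residual connections (width assumed)."""
    def __init__(self, width=128):
        super().__init__()
        self.fcs = [layers.Dense(width, activation=act) for _ in range(3)]
        self.proj = layers.Dense(width)   # match input width for the residual path
        self.logit = layers.Dense(1)      # membership score

    def call(self, x):
        h = self.proj(x)
        for fc in self.fcs:
            h = h + fc(h)                 # residual connection
        return self.logit(h)
```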