Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Fast Private Kernel Density Estimation via Locality Sensitive Quantization
Authors: Tal Wagner, Yonatan Naamad, Nina Mishra
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments demonstrate that our resulting DP-KDE mechanisms are fast and accurate on large datasets in both high and low dimensions. |
| Researcher Affiliation | Industry | Amazon. Correspondence to: Tal Wagner <EMAIL>. |
| Pseudocode | Yes | Algorithm 1: LSQ Mechanism for DP-KDE Curator |
| Open Source Code | Yes | Our code is available online. (Footnote 2: https://github.com/talwagner/lsq) |
| Open Datasets | Yes | Covertype: forest cover types (n = 581,012, d = 55) (Blackard & Dean, 1999); GloVe: word embeddings (n = 1,000,000, d = 100) (Pennington et al., 2014); Diabetes: age and days in hospital (n = 101,766, d = 2) (Strack et al., 2014); NYC Taxi: longitude and latitude (n = 100,000, d = 2) (Chavez et al., 2018) |
| Dataset Splits | No | The paper mentions holding out query points but does not specify explicit training/validation/test splits (e.g., percentages, counts, or cross-validation). |
| Hardware Specification | No | The paper does not explicitly describe the hardware (e.g., specific GPU/CPU models, memory, or cloud instances) used for running experiments. |
| Software Dependencies | No | The paper does not specify any software dependencies with version numbers (e.g., Python 3.8, PyTorch 1.9). |
| Experiment Setup | Yes | In LSQ-RFF, we parameterize the mechanism by the number of random Fourier features... In LSQ-FGT, the user selects an integer parameter ρ ≥ 1... For each dataset we tune the bandwidth according to the guidelines in prior work... Bandwidth values are tuned such that mean KDE values are on the order of 10^-2 and their standard deviation is also on the order of 10^-2. |
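
The Experiment Setup row refers to a mechanism parameterized by the number of random Fourier features and by a tuned bandwidth. As a rough illustration of how such parameters could enter a private KDE, the sketch below averages random Fourier features of a Gaussian kernel and adds Laplace noise to the average. It is not the paper's Algorithm 1 and is not taken from the authors' repository: the Gaussian kernel, the replace-one-point neighboring relation, and all function names (`rff_features`, `dp_kde_rff_release`, `dp_kde_query`, `exact_gaussian_kde`) are assumptions made for this example.

```python
import numpy as np


def rff_features(points, omegas, phases):
    """Map points (n, d) to random Fourier features (n, D).

    Every entry lies in [-sqrt(2/D), sqrt(2/D)].
    """
    num_features = omegas.shape[1]
    return np.sqrt(2.0 / num_features) * np.cos(points @ omegas + phases)


def dp_kde_rff_release(data, bandwidth, num_features, epsilon, rng):
    """Hypothetical curator step: release a noisy mean feature vector.

    Assumes the Gaussian kernel exp(-||x - y||^2 / (2 * bandwidth^2)),
    whose random Fourier frequencies are drawn from N(0, I / bandwidth^2).
    """
    n, d = data.shape
    omegas = rng.normal(scale=1.0 / bandwidth, size=(d, num_features))
    phases = rng.uniform(0.0, 2.0 * np.pi, size=num_features)
    mean_features = rff_features(data, omegas, phases).mean(axis=0)
    # Replacing one data point moves each coordinate of the mean by at most
    # 2 * sqrt(2 / D) / n, so the L1 sensitivity is at most 2 * sqrt(2 * D) / n.
    l1_sensitivity = 2.0 * np.sqrt(2.0 * num_features) / n
    noise = rng.laplace(scale=l1_sensitivity / epsilon, size=num_features)
    # omegas and phases are data-independent, so only the mean needs noise.
    return omegas, phases, mean_features + noise


def dp_kde_query(queries, omegas, phases, noisy_mean):
    """Estimate the KDE at each query from the released noisy summary."""
    return rff_features(queries, omegas, phases) @ noisy_mean


def exact_gaussian_kde(data, queries, bandwidth):
    """Non-private reference: exact Gaussian KDE at each query point."""
    sq_dists = ((queries[:, None, :] - data[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dists / (2.0 * bandwidth ** 2)).mean(axis=1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = rng.normal(size=(2000, 2))      # synthetic stand-in data
    queries = rng.normal(size=(5, 2))      # held-out query points
    summary = dp_kde_rff_release(data, bandwidth=1.0, num_features=500,
                                 epsilon=1.0, rng=rng)
    print("private:", dp_kde_query(queries, *summary))
    print("exact:  ", exact_gaussian_kde(data, queries, bandwidth=1.0))
```

In this sketch, more features reduce the kernel approximation error but increase the L1 sensitivity, and therefore the noise, at a rate of about the square root of the feature count, which is one reason the number of features is a natural tuning parameter. The bandwidth used here is arbitrary; the report states only that the paper tunes it so that mean KDE values and their standard deviation are on the order of 10^-2.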