Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Bounded Space Differentially Private Quantiles
Authors: Daniel Alabi, Omri Ben-Eliezer, Anamay Chaturvedi
TMLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We implement our algorithms and experimentally evaluate them on synthetic and real-world datasets. |
| Researcher Affiliation | Academia | Daniel Alabi (Columbia University); Omri Ben-Eliezer (Massachusetts Institute of Technology); Anamay Chaturvedi (Northeastern University) |
| Pseudocode | Yes | Algorithm 1: DPExp GK: Exponential Mechanism DP Quantiles: High Level Description. Data: X = (x1, x2, ..., xn). Input: ϵ, α (approximation parameter), q ∈ [0, 1] (quantile parameters), δu (sensitivity) |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described. It states "We implement our algorithms and experimentally evaluate them on synthetic and real-world datasets." but no link or explicit statement of code release is found. |
| Open Datasets | Yes | We repeat our investigation of the utility and space complexity comparison between DPExp GKGumb and DPExp Full with the following real-world data sets (Dua & Graff, 2017): (1) Taxi Service Trajectory: A dataset from the UCI machine learning repository describing trajectories performed by all 442 taxis (at the time) in the city of Porto in Portugal (Moreira-Matias et al., 2013). (2) Gas Sensor Dataset: A UCI repository dataset containing recordings of 16 chemical sensors exposed to varying concentrations of two gas mixtures (Fonollosa et al., 2015). |
| Dataset Splits | No | The paper does not provide specific details on training, validation, or test dataset splits. It mentions evaluating on synthetic and real-world datasets and conducting '100 trials' but does not specify how the data was partitioned for these evaluations. |
| Hardware Specification | No | The paper does not provide specific hardware details used for running its experiments. It mentions implementing and evaluating algorithms but does not specify the computing environment or components. |
| Software Dependencies | No | The paper does not provide specific software dependencies or their version numbers. It mentions implementing algorithms but does not list any programming languages, libraries, or frameworks with version details. |
| Experiment Setup | No | The paper mentions varying parameters such as 'stream length n', 'Approximation Parameter α', and 'Data Distribution' in Section 6, but it does not specify concrete hyperparameter values or detailed training configurations (e.g., learning rates, batch sizes, optimizers) for its experiments. |