Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Efficient and Private Marginal Reconstruction with Local Non-Negativity
Authors: Brett Mullins, Miguel Fuentes, Yingtai Xiao, Daniel Kifer, Cameron Musco, Daniel R. Sheldon
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we measure the utility of GReM-MLE and GReM-LNN by incorporating them as a post-processing step into two mechanisms for privately answering marginals: (1) Residual Planner [6], and (2) a data-dependent mechanism we call Scalable MWEM. Both mechanisms measure queries with Gaussian noise and reconstruct answers to all three-way marginals for the given data domain. |
| Researcher Affiliation | Academia | ¹University of Massachusetts, Amherst; ²Penn State University |
| Pseudocode | Yes | Algorithm 1 Residual Planner reconstruction; Algorithm 2 Residuals-to-Marginals (ReM); Algorithm 3 Gaussian ReM with Maximum Likelihood Estimation (GReM-MLE); Algorithm 4 Efficient Marginal Pseudoinversion (EMP); Algorithm 6 GReM-LNN Dual Ascent; Algorithm 7 Scalable MWEM |
| Open Source Code | Yes | Our code is available at https://github.com/bcmullins/efficient-marginal-reconstruction. |
| Open Datasets | Yes | Titanic [23], Adult [24], Salary [25], and Nist-Taxi [26] |
| Dataset Splits | No | The paper states it uses four datasets and runs five trials, but does not specify train/validation/test splits with percentages or counts for these datasets. |
| Hardware Specification | Yes | All experiments were run on an internal compute cluster with two CPU cores and 20GB of memory. |
| Software Dependencies | No | The paper mentions software used in a general sense (e.g., 'standard optimizers'), but does not provide specific version numbers for any software dependencies like programming languages, libraries, or frameworks. |
| Experiment Setup | Yes | For the Residual Planner experiments in Section 5.1, we set the hyperparameters as follows: the maximum number of rounds T = 4000, the Lagrangian initialization parameter λ = 1, and the step size s = 0.1. For the Scalable MWEM experiments in Section 5.2, we set the hyperparameters as follows: the maximum number of rounds T = 1000, the Lagrangian initialization parameter λ = 1, the step size s = 0.02, and regularization weight η = 40. |
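
For anyone re-running the experiments, the hyperparameters quoted in the Experiment Setup row could be recorded as a small configuration object. This is a minimal sketch, not taken from the authors' repository: the class name `ExperimentSetup`, its field names, and the two constant names are hypothetical, and only the numeric values come from the row above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ExperimentSetup:
    """Hypothetical container for the hyperparameters quoted in the table."""
    max_rounds: int                                 # T, maximum number of rounds
    lagrangian_init: float                          # λ, Lagrangian initialization parameter
    step_size: float                                # s, step size
    regularization_weight: Optional[float] = None   # η, only reported for Scalable MWEM


# Values reported for the Residual Planner experiments (Section 5.1).
RESIDUAL_PLANNER_SETUP = ExperimentSetup(
    max_rounds=4000, lagrangian_init=1.0, step_size=0.1
)

# Values reported for the Scalable MWEM experiments (Section 5.2).
SCALABLE_MWEM_SETUP = ExperimentSetup(
    max_rounds=1000, lagrangian_init=1.0, step_size=0.02,
    regularization_weight=40.0,
)
```

Because the summary reports no software versions, these values would still need to be paired with whatever dependency versions the repository actually pins.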