Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Dual Query: Practical Private Query Release for High Dimensional Data

Authors: Marco Gaboardi, Emilio Jesus Gallego Arias, Justin Hsu, Aaron Roth, Zhiwei Steven Wu

ICML 2014

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We evaluate Dual Query on a large collection of 3-way marginal queries on several real datasets (Figure 1) and high dimensional synthetic data. Adult and KDD99 are from the UCI repository (Bache & Lichman, 2013), and have a mixture of discrete (but nonbinary) and continuous attributes, which we discretize into binary attributes. We also use the (in)famous Netflix movie ratings dataset, with more than 17,000 binary attributes. We report maximum error in Figure 2, averaged over 5 runs.
Researcher Affiliation	Academia	Marco Gaboardi EMAIL University of Dundee, Dundee, Scotland, UK Emilio Jesus Gallego Arias EMAIL Justin Hsu EMAIL Aaron Roth EMAIL Zhiwei Steven Wu EMAIL University of Pennsylvania, Philadelphia, USA
Pseudocode	Yes	Algorithm 1 Dual Query
Open Source Code	No	The paper does not provide any concrete access information, such as a link to a repository or an explicit statement about releasing the source code for the methodology described.
Open Datasets	Yes	Adult and KDD99 are from the UCI repository (Bache & Lichman, 2013), and have a mixture of discrete (but nonbinary) and continuous attributes, which we discretize into binary attributes. We also use the Netflix movie ratings dataset, with more than 17,000 binary attributes.
Dataset Splits	No	The paper does not provide specific dataset split information (exact percentages, sample counts, citations to predefined splits, or detailed splitting methodology) needed to reproduce the data partitioning. It mentions evaluating on a certain number of marginals but not how the main dataset itself was split for training, validation, or testing.
Hardware Specification	Yes	We ran the experiments on a mid-range desktop machine with a 4-core Intel Xeon processor and 12 Gb of RAM.
Software Dependencies	No	The paper mentions 'The implementation is written in OCaml, using the CPLEX constraint solver.' but does not provide specific version numbers for OCaml or CPLEX.
Experiment Setup	Yes	Rather than set the parameters as in Algorithm 1, we experiment with a range of parameters. For instance, we frequently run for fewer rounds (lower T) and take fewer samples (lower s). Heuristically, we set a timeout for each CPLEX call to 20 seconds, accepting the best current solution if we hit the timeout.
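
The table above quotes the paper's evaluation metric: maximum error over 3-way marginal queries, averaged over 5 runs. As a rough illustration of what that metric measures, here is a minimal Python sketch. It is not the authors' code (their implementation is in OCaml with CPLEX), and it assumes the simplest form of a 3-way marginal: the fraction of records in which three chosen binary attributes are all 1. The function names `marginal`, `max_error`, and `mean_max_error` are hypothetical.

```python
def marginal(data, attrs):
    """Fraction of records in which every attribute index in attrs equals 1.

    Assumption: records are sequences of 0/1 values, and a query is a
    tuple of attribute indices (three of them for a 3-way marginal).
    """
    if not data:
        return 0.0
    hits = sum(1 for row in data if all(row[a] == 1 for a in attrs))
    return hits / len(data)


def max_error(true_data, synthetic_data, queries):
    """Largest absolute gap between true and synthetic query answers."""
    return max(
        abs(marginal(true_data, q) - marginal(synthetic_data, q))
        for q in queries
    )


def mean_max_error(true_data, synthetic_runs, queries):
    """Average the per-run maximum error over several synthetic outputs."""
    errors = [max_error(true_data, run, queries) for run in synthetic_runs]
    return sum(errors) / len(errors)


if __name__ == "__main__":
    # Toy data for illustration only; not taken from the paper.
    true_data = [[1, 1, 1], [1, 1, 0], [0, 1, 1], [1, 1, 1]]
    run_a = [[1, 1, 1], [1, 1, 1]]
    run_b = [row[:] for row in true_data]
    queries = [(0, 1, 2)]
    print(mean_max_error(true_data, [run_a, run_b], queries))
```

In a setup like the one quoted, `synthetic_runs` would hold the outputs of five independent runs of the private release algorithm, and `queries` would range over the chosen triples of attributes.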