Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Data Acquisition via Experimental Design for Data Markets

Authors: Charles Lu, Baihe Huang, Sai Praneeth Karimireddy, Praneeth Vepakomma, Michael Jordan, Ramesh Raskar

NeurIPS 2024 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We evaluate our proposed method for data acquisition (DAVED) against common data valuation methods on both synthetic data and four real-world medical datasets: 1. Fitzpatrick17K [24], a skin lesion dataset, where the task is to predict Fitzpatrick skin tone on a 6-point scale from dermatology images. 2. RSNA Pediatric Bone Age dataset [25], where the task is to assess bone age (in months) from X-ray images of an infant's hand. 3. Medical Information Mart for Intensive Care (MIMIC-III) [31], where the task is to predict the length of hospital stay from 48 attributes such as demographics, insurance, and medical conditions. 4. DrugLib reviews [34], text reviews of drugs where the task is to predict ratings (1-10). For validation-based methods, we use a validation set of 100 datapoints. We report mean test errors over 100 buyers.
Researcher Affiliation	Academia	Charles Lu MIT Baihe Huang UC Berkeley Sai Praneeth Karimireddy USC, UC Berkeley Praneeth Vepakomma MBZUAI, MIT Michael I. Jordan UC Berkeley Ramesh Raskar MIT
Pseudocode	Yes	Algorithm 1 DAVED: Iterative Optimization Procedure
Open Source Code	Yes	Our code is available at this repo: https://github.com/clu5/data-acquisition-via-experimental-design. For reproducibility, our full implementation is available at: https://github.com/clu5/data-acquisition-via-experimental-design.
Open Datasets	Yes	The RSNA Pediatric Bone Age Challenge (2017) dataset [25] may be downloaded here: https://www.rsna.org/rsnai/ai-image-challenge/rsna-pediatric-bone-age-challenge-2017. The Fitzpatrick17K dataset [24] can be downloaded from here: https://github.com/mattgroh/fitzpatrick17k. The MIMIC dataset [31] can be accessed here: https://physionet.org/content/mimiciii/1.4/. The DrugLib dataset [34] can be downloaded here: https://archive.ics.uci.edu/dataset/461/drug+review+dataset+druglib+com.
Dataset Splits	Yes	For validation-based methods, we use a validation set of 100 datapoints. Validation split: 100 points for baseline data valuation methods
Hardware Specification	Yes	We conduct all experiments on an Intel Xeon E5-2620 CPU with 40 cores and a Nvidia GTX 1080 Ti GPU.
Software Dependencies	Yes	For implementation of baseline data valuation methods, we use the OpenDataVal package [30] version 1.2.1.
Experiment Setup	Yes	In our experiments, we use the following setting of hyperparameters for DAVED: 500 iterations for the multi-step variant and 1 iteration for the single-step variant; line search for step size α ∈ (0, 0.9); regularization λ = 0 (unless otherwise specified); no early stopping. For each test point, we train a linear regression model on the selected seller points and report test mean squared error (MSE) on the buyer's data and average test error over 100 buyers.
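
The evaluation protocol quoted under Experiment Setup (fit a linear regression on the selected seller points, compute test MSE on the buyer's data, average over 100 buyers) could be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the function names, the synthetic noiseless data, the problem sizes, and the random selection used as a stand-in for DAVED are all assumptions made for the example.

```python
import numpy as np


def fit_linear_regression(X, y):
    """Ordinary least squares without an intercept, solved via lstsq."""
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w


def buyer_test_mse(X_sell, y_sell, selected, X_buy, y_buy):
    """Train on the selected seller points, then score on one buyer's data."""
    w = fit_linear_regression(X_sell[selected], y_sell[selected])
    residual = X_buy @ w - y_buy
    return float(np.mean(residual ** 2))


def mean_error_over_buyers(X_sell, y_sell, selections, buyers):
    """Average the per-buyer test MSE over all buyers."""
    errors = [
        buyer_test_mse(X_sell, y_sell, sel, X_buy, y_buy)
        for sel, (X_buy, y_buy) in zip(selections, buyers)
    ]
    return float(np.mean(errors))


# Hypothetical synthetic setup (sizes chosen only for illustration).
rng = np.random.default_rng(0)
n_sellers, n_features, n_buyers, budget = 200, 5, 100, 20
w_true = rng.normal(size=n_features)
X_sell = rng.normal(size=(n_sellers, n_features))
y_sell = X_sell @ w_true  # noiseless labels, an assumption for the demo

buyers, selections = [], []
for _ in range(n_buyers):
    X_buy = rng.normal(size=(1, n_features))  # one test point per buyer
    buyers.append((X_buy, X_buy @ w_true))
    # Placeholder: uniform random selection, NOT the DAVED selection rule.
    selections.append(rng.choice(n_sellers, size=budget, replace=False))

mean_mse = mean_error_over_buyers(X_sell, y_sell, selections, buyers)
```

To compare selection methods under this protocol, one would replace the random `selections` with the indices chosen by each method for each buyer and compare the resulting `mean_mse` values.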