Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Efficient Identification of Approximate Best Configuration of Training in Large Datasets
Authors: Silu Huang, Chi Wang, Bolin Ding, Surajit Chaudhuri
Pages: 3862-3869
AAAI 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct experiments with large datasets. We demonstrate that our ABC solution is tens to hundreds of times faster, while returning top configurations with no more than 1% accuracy loss. |
| Researcher Affiliation | Collaboration | ¹University of Illinois, Urbana-Champaign, IL; ²Microsoft Research, Redmond, WA; ³Alibaba Group, Bellevue, WA |
| Pseudocode | Yes | Algorithm 1: ABC |
| Open Source Code | No | The paper does not provide a direct link to open-source code for its methodology or an explicit statement of code release. |
| Open Datasets | Yes | We evaluate with five large-scale machine learning benchmarks that are publicly available. |
| Dataset Splits | No | The paper states: |
| Hardware Specification | Yes | We conducted our evaluation on a VM with 8 cores and 56 GB RAM. |
| Software Dependencies | No | The paper mentions |
| Experiment Setup | Yes | The initial training sample size and testing sample size are 1000 and 2000 respectively. The geometry step size is set to be c = 2. ϵ = 0.01, δ = 0.5. |
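
To make the reported setup concrete, the sketch below shows one way a geometric sample-size schedule with step c = 2 could grow from the initial training sample of 1000. It is only an illustration of the values in the Experiment Setup row, not the paper's Algorithm 1: the function name `geometric_sample_sizes` and the full dataset size of 100 000 are hypothetical, and the assumption that the sample size is multiplied by c at each step is an interpretation of "step size".

```python
# Values quoted in the Experiment Setup row.
INITIAL_TRAIN_SIZE = 1000   # initial training sample size
TEST_SIZE = 2000            # testing sample size
STEP = 2                    # geometric step size c
EPSILON = 0.01              # accuracy-loss tolerance (epsilon)
DELTA = 0.5                 # delta, as reported


def geometric_sample_sizes(initial_size, step, max_size):
    """Return sizes initial_size, initial_size*step, ... capped at max_size.

    Hypothetical helper: assumes the training sample is multiplied by
    `step` each round until it reaches the full dataset size.
    """
    if initial_size <= 0 or step <= 1 or max_size <= 0:
        raise ValueError("need initial_size > 0, step > 1, max_size > 0")
    sizes = []
    size = initial_size
    while size < max_size:
        sizes.append(size)
        size *= step
    sizes.append(max_size)  # the final round uses the whole dataset
    return sizes


# Hypothetical dataset of 100 000 rows (not a figure from the paper).
schedule = geometric_sample_sizes(INITIAL_TRAIN_SIZE, STEP, max_size=100_000)
print(schedule)
```

Under these assumptions, each round doubles the training sample, so only a handful of rounds are needed before the schedule reaches the full dataset.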