Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Training Set Debugging Using Trusted Items
Authors: Xuezhou Zhang, Xiaojin Zhu, Stephen Wright
AAAI 2018 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on toy and real data demonstrate that our approach can identify training set bugs effectively and suggest appropriate changes to the labels. |
| Researcher Affiliation | Academia | Xuezhou Zhang and Xiaojin Zhu and Stephen Wright EMAIL Department of Computer Sciences, University of Wisconsin-Madison |
| Pseudocode | Yes | Algorithm 1: DUTI |
| Open Source Code | Yes | All code and data are published at http://pages.cs.wisc.edu/~jerryzhu/DUTI. |
| Open Datasets | Yes | We study the UCI German Loan data set, which has been used in recent work on algorithmic fairness (Zemel et al. 2013; Feldman et al. 2015). Another dataset often used in algorithmic fairness is UCI Adult Income (Kohavi 1996; Kamishima, Akaho, and Sakuma 2011). In this section, we evaluate the debugging methods on a 10-class handwritten digit recognition problem (Mathworks 2017). |
| Dataset Splits | Yes | In all our experiments, the learner's hyperparameters are set by 10-fold cross validation on the original training data, and confidence levels on all trusted items c are set to 100. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details with version numbers (e.g., library or solver names with version numbers). |
| Experiment Setup | No | The paper states that 'the learner's hyperparameters are set by 10-fold cross validation', but it does not provide concrete hyperparameter values (e.g., learning rate, batch size, specific kernel parameters) or detailed training configurations in the main text. |