Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
You can't handle the (dirty) truth: Data-centric Insights Improve Pseudo-Labeling
Authors: Nabeel Seedat, Nicolas Huynh, Fergus Imrie, Mihaela van der Schaar
DMLR 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We now empirically investigate multiple aspects of DIPS... We evaluate the effectiveness of DIPS on 12 different real-world tabular datasets... We explore an extension of DIPS to images, highlighting its versatility. Setup. We investigate the use of DIPS to improve pseudo-labeling for CIFAR-10N (Wei et al., 2022a). |
| Researcher Affiliation | Academia | Nabeel Seedat EMAIL University of Cambridge, Cambridge, UK; Nicolas Huynh EMAIL University of Cambridge, Cambridge, UK; Fergus Imrie EMAIL University of California, Los Angeles, CA, USA; Mihaela van der Schaar EMAIL University of Cambridge, Cambridge, UK |
| Pseudocode | Yes | Algorithm 1 Plug DIPS into any pseudo-labeler |
| Open Source Code | Yes | https://github.com/seedatnabeel/DIPS or https://github.com/vanderschaarlab/DIPS |
| Open Datasets | Yes | Datasets. The tabular datasets are drawn from a variety of domains (e.g. healthcare, finance)... For example, Covid-19 (Baqui et al., 2020), MAGGIC (Pocock et al., 2013), SEER (Duggan et al., 2016), and CUTRACT (Prostate Cancer PCUK, 2019) are medical datasets. COMPAS (Angwin et al., 2016) is a recidivism dataset. Credit is a financial default dataset from a Taiwan bank (Yeh and Lien, 2009). Higgs is a physics dataset (Baldi et al., 2014)... We investigate the use of DIPS to improve pseudo-labeling for CIFAR-10N (Wei et al., 2022a). |
| Dataset Splits | Yes | We report results in Fig. 4 across 50 random seeds with different data splits with a fixed proportion of D_lab : D_unlab of 0.1:0.9. |
| Hardware Specification | Yes | Figure 8: (a) DIPS improves the time efficiency (hours reported on a V100 GPU) of FixMatch, by 1.5-4X for the same performance ( better). |
| Software Dependencies | No | The paper mentions using 'XGBoost backbone' and 'Wide ResNet-28' but does not specify version numbers for these or any other software dependencies. Specific version numbers are required for a reproducible description of ancillary software. |
| Experiment Setup | No | The paper mentions a fixed proportion of D_lab : D_unlab of 0.1:0.9, using 50 random seeds for tabular datasets, and n_lab = 1000 over three seeds for image datasets, and specifies model architectures like XGBoost and Wide ResNet-28. However, it does not explicitly provide specific hyperparameters such as learning rate, batch size, number of epochs, or optimizer settings in the main text. |
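
The split protocol quoted in the Dataset Splits row (a fixed 0.1:0.9 labeled-to-unlabeled proportion, repeated over 50 random seeds) can be illustrated with a minimal sketch. This is not the authors' code: the function name `split_labeled_unlabeled`, the sample count of 1000, and the use of seeds 0 to 49 are assumptions made purely for illustration, and the paper may stratify or sample differently.

```python
import random


def split_labeled_unlabeled(n_samples, labeled_fraction=0.1, seed=0):
    """Return (labeled_idx, unlabeled_idx) for one seeded random split.

    Hypothetical helper: the name and signature are illustrative only.
    """
    rng = random.Random(seed)  # a private generator keeps each split reproducible
    indices = list(range(n_samples))
    rng.shuffle(indices)
    n_labeled = round(n_samples * labeled_fraction)
    # The first n_labeled shuffled indices form the labeled set, the rest are unlabeled.
    return sorted(indices[:n_labeled]), sorted(indices[n_labeled:])


# One split per seed, mirroring the 50-seed protocol quoted in the table.
# The seed values and n_samples=1000 are assumptions, not taken from the paper.
splits = [split_labeled_unlabeled(1000, 0.1, seed) for seed in range(50)]
```

Fixing the seed per split means any single run can be regenerated exactly, which is the property a reader would need in order to reproduce a per-seed result.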