Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Data-IQ: Characterizing subgroups with heterogeneous outcomes in tabular data

Authors: Nabeel Seedat, Jonathan Crabbé, Ioana Bica, Mihaela van der Schaar

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We experimentally demonstrate the benefits of Data-IQ on four real-world medical datasets.
Researcher Affiliation | Academia | Nabeel Seedat (University of Cambridge); Jonathan Crabbé (University of Cambridge); Ioana Bica (University of Oxford, The Alan Turing Institute); Mihaela van der Schaar (University of Cambridge, The Alan Turing Institute, UCLA)
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | Yes | See footnotes 2 and 3. Footnote 2: https://github.com/seedatnabeel/Data-IQ; footnote 3: https://github.com/vanderschaarlab/Data-IQ
Open Datasets | Yes | We conduct experiments on four real-world medical datasets... (1) Covid-19 dataset of Brazilian patients [38], (2) Prostate cancer datasets from both the US [39] and UK [40], (3) Support dataset of seriously ill hospitalized adults [41], (4) Fetal state dataset of cardiotocography [42].
Dataset Splits | Yes | All models are trained to convergence, with early stopping on a validation set.
Hardware Specification | Yes | Did you include the total amount of compute and the type of resources used (e.g., type of GPUs, internal cluster, or cloud provider)? [Yes] See Appendix B
Software Dependencies | No | The paper does not provide specific software dependencies with version numbers in the main text or the ethics checklist.
Experiment Setup | Yes | Did you specify all the training details (e.g., data splits, hyperparameters, how they were chosen)? [Yes] See Appendix B, detailing all relevant information