Direct Uncertainty Prediction for Medical Second Opinions

Authors: Maithra Raghu, Katy Blumer, Rory Sayres, Ziad Obermeyer, Bobby Kleinberg, Sendhil Mullainathan, Jon Kleinberg

ICML 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We show this both with a theoretical result, and on extensive evaluations on a large scale medical imaging application.
Researcher Affiliation | Collaboration | (1) Department of Computer Science, Cornell University; (2) Google Brain; (3) UC Berkeley School of Public Health; (4) Chicago Booth School of Business.
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not include an unambiguous statement that the authors are releasing the code for the work described, nor does it provide a direct link to a source-code repository.
Open Datasets | Yes | We show this both with a theoretical result, and on extensive evaluations on a large scale medical imaging application. ... We train DUP and UVC models on different mixtures of Gaussians... Another empirical demonstration is given by training DUP and UVC to predict label agreement in an image blurring setting. For a source image in SVHN or CIFAR-10...
Dataset Splits | Yes | The DUP and UVC models are trained and evaluated using a train/test split on T, T_train, T_test. This split is constructed using the patient ids of the x_i ∈ T, with 20% of patient ids being set aside to form T_test and 80% to form T_train (of which 10% is sometimes used as a validation set).
Hardware Specification | No | The paper does not explicitly describe the hardware used for running its experiments, such as specific GPU or CPU models.
Software Dependencies | No | The paper mentions 'Inception-v3 base' and 'small neural network' but does not specify any software dependencies with version numbers (e.g., Python, TensorFlow, PyTorch versions).
Experiment Setup | Yes | All models rely on an Inception-v3 base that, following prior work (Gulshan et al., 2016), is initialized with pretrained weights on ImageNet. The simplest UVC is Histogram-E2E, the same model used in (Gulshan et al., 2016). We improved this baseline by instead taking the prelogit embeddings of Histogram-E2E, and training a small neural network (fully connected, two hidden layers width 300) with temperature scaling (as in (Guo et al., 2017)) only on x_i with multiple labels.
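
The Dataset Splits entry describes a split made on patient ids rather than on individual images, so that no patient contributes examples to both training and test data. The paper releases no code, so the following is only a minimal sketch of that idea under stated assumptions: the function name `split_by_patient`, its parameters, the fixed seed, and the use of Python's standard `random` module are all hypothetical and not taken from the paper.

```python
import random


def split_by_patient(patient_ids, test_frac=0.2, val_frac=0.1, seed=0):
    """Assign example indices to train/val/test so that splits share no patient.

    patient_ids[i] is the patient id of example i (hypothetical input layout).
    test_frac is the fraction of patients held out for testing; val_frac is
    the fraction of the remaining patients used for validation.
    """
    unique_ids = sorted(set(patient_ids))
    rng = random.Random(seed)  # fixed seed so the split is repeatable
    rng.shuffle(unique_ids)

    # Hold out a fraction of patients (not of images) for the test set.
    n_test = round(test_frac * len(unique_ids))
    test_ids = set(unique_ids[:n_test])

    # Carve an optional validation set out of the remaining patients.
    remaining = unique_ids[n_test:]
    n_val = round(val_frac * len(remaining))
    val_ids = set(remaining[:n_val])

    splits = {"train": [], "val": [], "test": []}
    for index, pid in enumerate(patient_ids):
        if pid in test_ids:
            splits["test"].append(index)
        elif pid in val_ids:
            splits["val"].append(index)
        else:
            splits["train"].append(index)
    return splits
```

Calling `split_by_patient(ids)` with one id per image returns three lists of example indices. Because the split is drawn over patients, the image-level proportions only approximate 80/20 when patients have differing numbers of images.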
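
The Experiment Setup entry mentions temperature scaling (Guo et al., 2017), which divides a model's logits by a single scalar T before the softmax and chooses T to minimize negative log-likelihood on held-out data. The sketch below illustrates only that general idea. It is not the authors' implementation: the function names are hypothetical, and the grid search over candidate temperatures is a simple stand-in for the gradient-based optimization normally used.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Softmax of logits / temperature; larger temperatures flatten the output."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def mean_nll(logit_rows, labels, temperature):
    """Average negative log-likelihood of the true labels at a given temperature."""
    total = 0.0
    for logits, label in zip(logit_rows, labels):
        probs = softmax_with_temperature(logits, temperature)
        total -= math.log(probs[label])
    return total / len(labels)


def fit_temperature(logit_rows, labels, candidates):
    """Pick the candidate temperature with the lowest held-out NLL (grid search)."""
    return min(candidates, key=lambda t: mean_nll(logit_rows, labels, t))
```

Dividing every logit by the same positive scalar leaves the predicted class unchanged, so this kind of calibration adjusts confidence without affecting accuracy.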