reproducibilityindex.ai

SCRIB: Set-Classifier with Class-Specific Risk Bounds for Blackbox Models

Authors: Zhen Lin, Lucas Glass, M. Brandon Westover, Cao Xiao, Jimeng Sun (pp. 7497-7505)

AAAI 2022

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We validated SCRIB on several medical applications, including sleep staging on electroencephalogram (EEG) data, X-ray COVID image classification, and atrial fibrillation detection based on electrocardiogram (ECG) data. SCRIB obtained desirable class-specific risks, which are 35%-88% closer to the target risks than baseline methods.
Researcher Affiliation	Collaboration	(1) University of Illinois at Urbana-Champaign, Urbana, IL, United States; (2) Analytics Center of Excellence, IQVIA, Boston, MA, United States; (3) Massachusetts General Hospital, Boston, MA, United States; (4) Harvard Medical School, Boston, MA, United States; (5) Amplitude, San Francisco, CA, United States
Pseudocode	Yes	Algorithm 1: Thresholds Finding for SCRIB. Input: $M \in \mathbb{R}^{N \times K}$: model output on $S_{valid}$ sorted by column. $M_{i,k}$ is the $i$-th smallest value in $\{m_k(x)\}_{x \in S_{valid}}$, for class $k$. $N$ denotes $\lvert S_{valid} \rvert$. $\hat{L}: \mathbb{R}^K \mapsto \mathbb{R}$, empirical loss function on $S_{valid}$. Output: $t \in \mathbb{R}^K$: optimal thresholds for the set classifier $H$. Algorithm: For $k \in [K]$, initialize $t_k$ randomly from $M_{\cdot,k}$, and evaluate current loss $l \leftarrow \hat{L}(t)$. repeat: for $k = 1$ to $K$ do: fixing $t_{k'}$ for all $k' \neq k$, search $t'_k$ in $M_{\cdot,k}$ to minimize $\hat{L}$ using Quick Search (see Appendix); $l'_k \leftarrow \hat{L}({t'}^{(k)})$ where ${t'}^{(k)} := (t_1, \ldots, t'_k, \ldots, t_K)$; end for. If $\min_{k \in [K]} l'_k < l$, update $l \leftarrow l'_k$ and $t \leftarrow {t'}^{(k)}$; until $l$ does not improve. return $t$
Open Source Code	No	The paper does not provide any statements or links indicating that the source code for their methodology is publicly available.
Open Datasets	Yes	ISRUC (Sub-group 1) (Khalighi et al. 2016) is a public Polysomnographic (PSG) dataset for sleep staging. Sleep-EDF (Kemp et al. 2000; Goldberger et al. 2000) is another public dataset widely used to evaluate sleep staging models. ECG (PhysioNet 2017) (Clifford et al. 2017; Goldberger et al. 2000) is a public ECG dataset. X-ray dataset is constructed from two publicly available sources, COVID Chest X-ray and Kaggle Chest X-ray.
Dataset Splits	Yes	We assume that data in $S_{valid}$ and $S_{test}$ are iid, and we can only use label information on $S_{train}$ and $S_{valid}$. We are given a model $m$ (potentially a DNN) trained on $S_{train}$: $\mathcal{X} \mapsto \mathbb{R}^K$... 75% of the data are used to train the base classifier. Excluding samples for model training, each class's sample counts are presented in Table 2. These are evenly split into validation and test sets.
Hardware Specification	No	The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments. It mentions using 'Deep Learning Models' and 'ResNet-based' architectures.
Software Dependencies	No	The paper mentions using deep learning models but does not specify any software names with version numbers (e.g., Python, TensorFlow, PyTorch versions) needed for reproducibility.
Experiment Setup	Yes	To quantitatively compare different methods, we will set the target risks ($r^*_k$) for SCRIB to 15% for all classes for ECG and 10% for other datasets. ... $\lambda_k$ is set to $10^4$ for all classes and datasets, and we use chance-ambiguity for $A(H)$. For ISRUC and Sleep-EDF, we used a ResNet-based (He et al. 2016) model with 3 Residual Blocks, each with 2 convolution layers.
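
The coordinate-wise threshold search quoted in the Pseudocode row can be illustrated with a minimal Python sketch. Several pieces are assumptions, because the table does not define them: the set rule (class k is included when its score exceeds t_k), the form of the empirical loss (fraction of non-singleton predictions plus a squared penalty on class risks above the target), and the per-class risk definition. The names `empirical_loss` and `find_thresholds` are hypothetical, and an exhaustive scan over each column stands in for the paper's Quick Search, which is described only in its appendix. The penalty weight default is an illustrative large value.

```python
import numpy as np


def empirical_loss(t, scores, labels, target_risk, lam):
    """Assumed loss: ambiguity plus a penalty on class risks above the target."""
    pred = scores > t                       # pred[i, k]: class k is in the set for sample i
    singleton = pred.sum(axis=1) == 1       # exactly one class predicted
    ambiguity = 1.0 - singleton.mean()      # assumed stand-in for A(H)
    # A singleton prediction is wrong when the true class is not the one in the set.
    wrong = singleton & ~pred[np.arange(len(labels)), labels]
    penalty = 0.0
    for k in range(scores.shape[1]):
        denom = (singleton & (labels == k)).sum()
        risk_k = (wrong & (labels == k)).sum() / denom if denom else 0.0
        penalty += lam * max(risk_k - target_risk, 0.0) ** 2
    return ambiguity + penalty


def find_thresholds(scores, labels, target_risk=0.1, lam=1e4, seed=0):
    """Coordinate descent over per-class thresholds drawn from observed scores."""
    rng = np.random.default_rng(seed)
    n, K = scores.shape
    M = np.sort(scores, axis=0)             # M[i, k]: i-th smallest score for class k
    t = M[rng.integers(n, size=K), np.arange(K)].copy()  # random initial thresholds
    loss = empirical_loss(t, scores, labels, target_risk, lam)
    while True:
        best_loss, best_t = loss, None
        for k in range(K):
            for cand in M[:, k]:            # exhaustive scan, not Quick Search
                t_new = t.copy()
                t_new[k] = cand             # change coordinate k only
                l_new = empirical_loss(t_new, scores, labels, target_risk, lam)
                if l_new < best_loss:
                    best_loss, best_t = l_new, t_new
        if best_t is None:                  # no single-coordinate change improves the loss
            return t, loss
        t, loss = best_t, best_loss
```

Each pass keeps only the best single-coordinate update, so the loss strictly decreases until no candidate helps. The exhaustive scan evaluates the loss N times per class per pass, which is acceptable for a small validation set but is the cost a faster search would avoid.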