Multivariate Conditional Outlier Detection and Its Clinical Application

Authors: Charmgil Hong, Milos Hauskrecht

AAAI 2016

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We present experimental results on a clinical dataset obtained from Cincinnati Children's Hospital Medical Center (Pestian et al. 2007). The dataset contains 978 instances; each consists of 1,449 features (x) extracted from clinical progress notes and 45 binary response variables (y) representing the diseases diagnosed. We compare our Multivariate Conditional Outlier DEtection method (MCODE) (Hong and Hauskrecht 2015) with two state-of-the-art multivariate outlier detection methods: Local Outlier Factor (LOF) (Breunig et al. 2000) and One-class SVM (OS) (Amer, Goldstein, and Abdennadher 2013). We performed 10-fold cross validation; on each round, we perturbed 0.5% of the data by randomly flipping 1 to 5 response variables (hence, the outliers represent misdiagnoses), and evaluated how the methods identify the outliers. Figure 1 shows the results in terms of the area under the precision-recall curve (AUCPR).
Researcher Affiliation | Academia | Charmgil Hong and Milos Hauskrecht, Department of Computer Science, University of Pittsburgh, Pittsburgh, PA 15260, {charmgil, milos}@cs.pitt.edu
Pseudocode | No | No pseudocode or algorithm blocks are provided in the paper.
Open Source Code | No | The paper does not provide any statement or link regarding the availability of open-source code for the described methodology.
Open Datasets | Yes | We present experimental results on a clinical dataset obtained from Cincinnati Children's Hospital Medical Center (Pestian et al. 2007).
Dataset Splits | Yes | We performed 10-fold cross validation; on each round, we perturbed 0.5% of the data by randomly flipping 1 to 5 response variables (hence, the outliers represent misdiagnoses), and evaluated how the methods identify the outliers.
Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, processor types) used for running experiments are mentioned.
Software Dependencies | No | No specific software dependencies with version numbers are mentioned in the paper.
Experiment Setup | No | The paper describes the general experimental setup (e.g., 10-fold cross-validation, data perturbation) and the methods compared, but it does not provide specific hyperparameters or system-level training settings for the models used (e.g., learning rates, batch sizes, or specific parameters for LOF or One-class SVM).
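To make the reported perturbation protocol concrete, the following is a minimal standard-library Python sketch of the outlier-injection step described in the paper: 0.5% of instances have 1 to 5 of their 45 binary response variables flipped, and those instances become the ground-truth outliers (simulated misdiagnoses). The `perturb` function, the fixed seed, and the all-zeros placeholder label matrix are illustrative assumptions, not code from the paper.

```python
import random

def perturb(Y, frac=0.005, min_flips=1, max_flips=5, seed=0):
    """Inject synthetic misdiagnosis outliers: for `frac` of the instances,
    flip between min_flips and max_flips of the binary response variables.
    Returns the perturbed label matrix and a 0/1 outlier indicator list."""
    rng = random.Random(seed)
    Y = [row[:] for row in Y]                   # copy; leave the input intact
    n, d = len(Y), len(Y[0])
    n_outliers = max(1, round(frac * n))        # 0.5% of 978 instances -> 5
    is_outlier = [0] * n
    for i in rng.sample(range(n), n_outliers):
        k = rng.randint(min_flips, max_flips)   # how many responses to flip
        for j in rng.sample(range(d), k):       # flip k distinct response bits
            Y[i][j] = 1 - Y[i][j]
        is_outlier[i] = 1
    return Y, is_outlier

# Dimensions from the paper: 978 instances, 45 binary responses.
# (All-zeros labels are a placeholder for the real clinical data.)
Y = [[0] * 45 for _ in range(978)]
Y_pert, labels = perturb(Y)
print(sum(labels))  # → 5 perturbed instances
```

In the paper's setup a detector (MCODE, LOF, or One-class SVM) would then score each instance, and the scores would be compared against `labels` to compute AUCPR within each cross-validation round.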