Multivariate Conditional Outlier Detection and Its Clinical Application
Authors: Charmgil Hong, Milos Hauskrecht
AAAI 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We present experimental results on a clinical dataset obtained from Cincinnati Children's Hospital Medical Center (Pestian et al. 2007). The dataset contains 978 instances; each consists of 1,449 features (x) extracted from clinical progress notes and 45 binary response variables (y) representing the diseases diagnosed. We compare our Multivariate Conditional Outlier DEtection method (MCODE) (Hong and Hauskrecht 2015) with two state-of-the-art multivariate outlier detection methods: Local Outlier Factor (LOF) (Breunig et al. 2000) and One-class SVM (OS) (Amer, Goldstein, and Abdennadher 2013). We performed 10-fold cross validation; on each round, we perturbed 0.5% of the data by randomly flipping 1 to 5 response variables (hence, the outliers represent misdiagnoses), and evaluated how the methods identify the outliers. Figure 1 shows the results in terms of the area under the precision-recall curve (AUCPR). |
| Researcher Affiliation | Academia | Charmgil Hong and Milos Hauskrecht, Department of Computer Science, University of Pittsburgh, Pittsburgh, PA 15260, {charmgil, milos}@cs.pitt.edu |
| Pseudocode | No | No pseudocode or algorithm blocks are provided in the paper. |
| Open Source Code | No | The paper does not provide any statement or link regarding the availability of open-source code for the described methodology. |
| Open Datasets | Yes | We present experimental results on a clinical dataset obtained from Cincinnati Children's Hospital Medical Center (Pestian et al. 2007). |
| Dataset Splits | Yes | We performed 10-fold cross validation; on each round, we perturbed 0.5% of the data by randomly flipping 1 to 5 response variables (hence, the outliers represent misdiagnoses), and evaluated how the methods identify the outliers. |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, processor types) used for running experiments are mentioned. |
| Software Dependencies | No | No specific software dependencies with version numbers are mentioned in the paper. |
| Experiment Setup | No | The paper describes the general experimental setup (e.g., 10-fold cross-validation, data perturbation) and the methods compared, but it does not provide specific hyperparameters or system-level training settings for the models used (e.g., learning rates, batch sizes, or specific parameters for LOF or One-class SVM). |
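
Because the paper ships neither code nor pseudocode, the perturbation and scoring steps quoted in the table can only be approximated. The following is a minimal sketch, not the authors' implementation. It assumes the label matrix is a list of 0/1 rows, that "0.5% of the data" means 0.5% of the rows handed to the function, and that AUCPR is estimated by step-wise average precision. The names `inject_label_outliers` and `average_precision` are hypothetical, and the detectors themselves (MCODE, LOF, One-class SVM) are left out.

```python
import random


def inject_label_outliers(Y, fraction=0.005, min_flips=1, max_flips=5, seed=0):
    """Flip 1 to 5 binary responses in a random `fraction` of the rows.

    Y is a list of rows, each a list of 0/1 labels. Returns a perturbed copy
    of Y and a parallel list of booleans marking the injected outliers.
    Assumption: the fraction applies to the rows passed in.
    """
    rng = random.Random(seed)
    n = len(Y)
    n_outliers = max(1, round(fraction * n))
    outlier_rows = rng.sample(range(n), n_outliers)

    Y_new = [row[:] for row in Y]  # copy so the input is left untouched
    is_outlier = [False] * n
    for i in outlier_rows:
        n_labels = len(Y_new[i])
        k = rng.randint(min_flips, min(max_flips, n_labels))
        for j in rng.sample(range(n_labels), k):
            Y_new[i][j] = 1 - Y_new[i][j]  # simulate a misdiagnosis
        is_outlier[i] = True
    return Y_new, is_outlier


def average_precision(scores, is_outlier):
    """Step-wise estimate of the area under the precision-recall curve.

    Higher scores mean "more outlying". The result is the mean of the
    precision values measured at the rank of each true outlier.
    """
    ranked = sorted(zip(scores, is_outlier), key=lambda pair: -pair[0])
    hits = 0
    total = 0.0
    for rank, (_, flag) in enumerate(ranked, start=1):
        if flag:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0
```

In use, any detector's outlier scores for the perturbed rows would be passed to `average_precision` together with the `is_outlier` flags, once per cross-validation round. With 978 rows and the default fraction, the sketch marks 5 rows as outliers.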