Robust High-Dimensional Classification From Few Positive Examples

Authors: Deepayan Chakrabarti, Benjamin Fauber

IJCAI 2022

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We validate DIRECT on several real-world datasets spanning document, image, and medical classification. DIRECT is up to 5x-7x better than SMOTE-like methods, 30-200% better than ensemble methods, and 3x-7x better than cost-sensitive methods. The greatest gains are in settings with the fewest samples in the minority class, where DIRECT's robustness is most helpful. |
| Researcher Affiliation | Collaboration | Deepayan Chakrabarti (University of Texas at Austin), Benjamin Fauber (Dell Inc.) |
| Pseudocode | Yes | Algorithm 1: DIRECT |
| Open Source Code | Yes | Our code is available at https://github.com/deepayan12/direct. |
| Open Datasets | Yes | We ran experiments on six text, two image, and one medical dataset, along with 20 UCI datasets (Table 1). Table 1 lists specific datasets such as 20-Newsgroups, Reuters, MNIST (digits), and UCI (20 datasets). The Tumors dataset refers to [Yeang et al., 2001]. |
| Dataset Splits | No | In each experiment, we created a training set with n_lo positive and n_hi negative samples that were randomly chosen from the dataset. All remaining datapoints were used for testing. The paper states DIRECT "does not need cross-validation" and does not describe using a separate validation set for its own experiments. |
| Hardware Specification | No | The paper mentions evaluating "Wall-Clock Time" but provides no details about the hardware (e.g., CPU, GPU, memory, specific models) used to run the experiments. |
| Software Dependencies | No | The paper mentions using a linear SVM, XGBoost, and "any off-the-shelf solver", but does not provide version numbers for any of these software components or libraries. |
| Experiment Setup | Yes | Our proposed classifier uses a robust kernel density to model the minority-class distribution. With an appropriate choice of loss function, ℓ(y, x; θ = (c, w)) = max(0, 1 - y(c + wᵀx)), and a post-processing step, we adjust the intercept: min_{c ∈ ℝ} ... 1{c + wᵀxᵢ}. In each experiment, we created a training set with n_lo positive and n_hi negative samples that were randomly chosen from the dataset. All remaining datapoints were used for testing. We ran experiments on 509 unique (dataset, class, n_lo, n_hi) combinations, each repeated 30 times. |
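The sampling protocol and hinge loss quoted in the Experiment Setup row can be sketched as follows. This is a minimal illustration, not the paper's DIRECT implementation: the toy dataset, dimensions, and sample counts are placeholders, and the intercept post-processing step is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

def hinge_loss(y, x, c, w):
    # Loss from the paper: l(y, x; theta=(c, w)) = max(0, 1 - y(c + w^T x)),
    # with labels y in {-1, +1}.
    return max(0.0, 1.0 - y * (c + w @ x))

def make_split(X, y, n_lo, n_hi, rng):
    # Training set: n_lo randomly chosen positives and n_hi randomly chosen
    # negatives; all remaining datapoints are used for testing.
    pos = rng.permutation(np.where(y == 1)[0])
    neg = rng.permutation(np.where(y == -1)[0])
    train_idx = np.concatenate([pos[:n_lo], neg[:n_hi]])
    test_idx = np.concatenate([pos[n_lo:], neg[n_hi:]])
    return train_idx, test_idx

# Toy imbalanced dataset (placeholder sizes): 20 positives, 200 negatives.
X = rng.normal(size=(220, 5))
y = np.concatenate([np.ones(20), -np.ones(200)])
X[y == 1] += 2.0  # shift positives so the classes are roughly separable

train_idx, test_idx = make_split(X, y, n_lo=5, n_hi=50, rng=rng)
print(len(train_idx), len(test_idx))  # 55 165
```

In the paper's experiments this split is redrawn 30 times per (dataset, class, n_lo, n_hi) combination; the loop is omitted here for brevity.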