Robust High-Dimensional Classification From Few Positive Examples
Authors: Deepayan Chakrabarti, Benjamin Fauber
IJCAI 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We validate DIRECT on several real-world datasets spanning document, image, and medical classification. DIRECT is up to 5x-7x better than SMOTE-like methods, 30-200% better than ensemble methods, and 3x-7x better than cost-sensitive methods. The greatest gains are for settings with the fewest samples in the minority class, where DIRECT's robustness is most helpful. |
| Researcher Affiliation | Collaboration | Deepayan Chakrabarti (University of Texas, Austin), Benjamin Fauber (Dell Inc.) |
| Pseudocode | Yes | Algorithm 1 DIRECT |
| Open Source Code | Yes | Our code is available at https://github.com/deepayan12/direct. |
| Open Datasets | Yes | We ran experiments on six text, two image, and one medical dataset, along with 20 UCI datasets (Table 1). Table 1 lists specific datasets like '20-Newsgroups', 'Reuters', 'MNIST (digits)', 'UCI (20 datasets)'. The Tumors dataset refers to '[Yeang et al., 2001]'. |
| Dataset Splits | No | In each experiment, we created a training set with n_lo positive and n_hi negative samples that were randomly chosen from the dataset. All remaining datapoints were used for testing. The paper states DIRECT 'does not need cross-validation' and does not describe using a separate validation set for its own experiments. A minimal sketch of this split protocol appears after the table. |
| Hardware Specification | No | The paper mentions evaluating 'Wall-Clock Time' but provides no specific details about the hardware (e.g., CPU, GPU, memory, specific models) used for running the experiments. |
| Software Dependencies | No | The paper mentions using a 'linear SVM', 'XGBoost', and 'any off-the-shelf solver', but does not provide specific version numbers for any of these software components or libraries. |
| Experiment Setup | Yes | Our proposed classifier... uses a robust kernel density to model the minority class distribution. With an appropriate choice of loss function, ℓ(y, x; θ = (c, w)) = max(0, 1 − y(c + wᵀx)), and a post-processing step, we adjust the intercept by solving min_{c ∈ ℝ} Σ_i max(0, 1 − y_i(c + wᵀx_i)). In each experiment, we created a training set with n_lo positive and n_hi negative samples that were randomly chosen from the dataset. All remaining datapoints were used for testing. We ran experiments on 509 unique (dataset, class, n_lo, n_hi) combinations, each being repeated 30 times. A sketch of the quoted loss and intercept adjustment appears after the table. |
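
The split protocol quoted in the Dataset Splits row can be written in a few lines. The sketch below is a minimal illustration, assuming binary labels in {+1, −1}; the function name `make_imbalanced_split` and its signature are hypothetical and are not taken from the released code.

```python
import numpy as np

def make_imbalanced_split(X, y, n_lo, n_hi, seed=None):
    """Sample n_lo positives and n_hi negatives for training;
    all remaining datapoints form the test set (labels assumed in {+1, -1})."""
    rng = np.random.default_rng(seed)
    pos_idx = np.flatnonzero(y == 1)
    neg_idx = np.flatnonzero(y == -1)
    train_idx = np.concatenate([
        rng.choice(pos_idx, size=n_lo, replace=False),
        rng.choice(neg_idx, size=n_hi, replace=False),
    ])
    test_idx = np.setdiff1d(np.arange(len(y)), train_idx)
    return X[train_idx], y[train_idx], X[test_idx], y[test_idx]
```

Repeating this sampling 30 times per (dataset, class, n_lo, n_hi) combination, as the quoted setup describes, only requires varying the seed.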
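The Experiment Setup row quotes a hinge loss and an intercept post-processing step. The sketch below illustrates both under the assumption that the weight vector w has already been fixed; the one-dimensional grid search over c is an illustrative stand-in for whatever solver the authors actually used (the paper only mentions that an off-the-shelf solver suffices).

```python
import numpy as np

def hinge_loss(y, X, c, w):
    """Summed hinge loss: sum_i max(0, 1 - y_i (c + w^T x_i))."""
    return np.maximum(0.0, 1.0 - y * (c + X @ w)).sum()

def adjust_intercept(X, y, w, num_candidates=1001):
    """Post-processing step: with w fixed, pick the intercept c that
    minimizes the summed hinge loss on the training data."""
    scores = X @ w
    # Candidate intercepts bracketing every training score (an assumption;
    # the paper does not say how the 1-D problem over c is solved).
    grid = np.linspace(-scores.max() - 1.0, -scores.min() + 1.0, num_candidates)
    losses = [hinge_loss(y, X, c, w) for c in grid]
    return grid[int(np.argmin(losses))]
```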