Modeling Skewed Class Distributions by Reshaping the Concept Space
Authors: Kyle Feuz, Diane Cook
AAAI 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We validate ICC and analyze alternative decomposition methods on well-known machine learning datasets as well as new problems in pervasive computing. Our results indicate that ICC performs as well or better than existing approaches to handling class imbalance. |
| Researcher Affiliation | Academia | Kyle D. Feuz, School of Computing, Weber State University; Diane J. Cook, School of Electrical Engineering and Computer Science, Washington State University |
| Pseudocode | No | The paper describes the process steps but does not provide a formal pseudocode block or algorithm listing. |
| Open Source Code | Yes | Our source code and binary jar files are available as Weka add-on packages (footnote 1: http://icarus.cs.weber.edu/ kfeuz/weka/). |
| Open Datasets | Yes | Twelve of the datasets come from the UC-Irvine Machine learning repository (Lichman 2013) |
| Dataset Splits | Yes | We run 10 iterations of 3-fold cross validation for each dataset to determine which results are significant. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU, GPU models, or memory specifications) used for running experiments. |
| Software Dependencies | No | The paper states that its code is available as 'Weka add-on packages' but does not specify a version number for Weka or any other software dependencies. |
| Experiment Setup | Yes | We consider three different ways of selecting labels to decompose: ICC One, ICC Maj, and ICC All. Additionally, we have three different techniques for determining the number of clusters. The first technique, ICC Avg... The second technique, ICC Fix, uses a fixed number of clusters per class. This can be useful when a domain expert has knowledge about the classes and knows that each class is really composed of x sub-classes. ... We evaluate ICC using three different clustering algorithms: k-means++ (Arthur and Vassilvitskii 2007), Expectation Maximization Clustering, and Cascade Simple KMeans (Caliński and Harabasz 1974). ... We run 10 iterations of 3-fold cross validation for each dataset to determine which results are significant. |
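
The evaluation protocol quoted above (10 iterations of 3-fold cross validation) can be illustrated with a minimal sketch. It is not the authors' code: the function name `repeated_stratified_folds`, the toy label vector, the seed, and the choice to stratify folds by class are all assumptions made for illustration; the paper excerpt itself only states the number of iterations and folds.

```python
import random
from collections import defaultdict


def repeated_stratified_folds(labels, n_folds=3, n_repeats=10, seed=0):
    """Yield (repeat, fold, train_idx, test_idx) for repeated k-fold CV.

    Stratification is an assumption of this sketch, not a detail
    confirmed by the quoted text.
    """
    rng = random.Random(seed)

    # Group example indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    for repeat in range(n_repeats):
        folds = [[] for _ in range(n_folds)]
        for members in by_class.values():
            shuffled = members[:]
            rng.shuffle(shuffled)
            # Deal each class round-robin so every fold keeps
            # roughly the original class ratio.
            for pos, idx in enumerate(shuffled):
                folds[pos % n_folds].append(idx)

        for fold, test_idx in enumerate(folds):
            held_out = set(test_idx)
            train_idx = [i for i in range(len(labels)) if i not in held_out]
            yield repeat, fold, train_idx, sorted(test_idx)


# Hypothetical imbalanced dataset: 27 majority and 3 minority examples.
labels = ["maj"] * 27 + ["min"] * 3
splits = list(repeated_stratified_folds(labels))
print(len(splits))  # 10 repeats x 3 folds = 30 train/test splits
```

Each of the 30 splits would be used to train and score one model, and the resulting 30 scores per method are what a significance test would then compare. With skewed classes, stratifying matters because an unstratified fold can end up with no minority examples at all.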