Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026.
Modeling Skewed Class Distributions by Reshaping the Concept Space
Authors: Kyle Feuz, Diane Cook
AAAI 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We validate ICC and analyze alternative decomposition methods on well-known machine learning datasets as well as new problems in pervasive computing. Our results indicate that ICC performs as well or better than existing approaches to handling class imbalance. |
| Researcher Affiliation | Academia | Kyle D. Feuz, School of Computing, Weber State University; Diane J. Cook, School of Electrical Engineering and Computer Science, Washington State University |
| Pseudocode | No | The paper describes the process steps but does not provide a formal pseudocode block or algorithm listing. |
| Open Source Code | Yes | Our source code and binary jar files are available [1] as Weka add-on packages. [1] http://icarus.cs.weber.edu/ kfeuz/weka/ |
| Open Datasets | Yes | Twelve of the datasets come from the UC-Irvine Machine learning repository (Lichman 2013) |
| Dataset Splits | Yes | We run 10 iterations of 3-fold cross validation for each dataset to determine which results are significant. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU, GPU models, or memory specifications) used for running experiments. |
| Software Dependencies | No | The paper states that its code is available as 'Weka add-on packages' but does not specify a version number for Weka or any other software dependencies. |
| Experiment Setup | Yes | We consider three different ways of selecting labels to decompose: ICC One, ICC Maj, and ICC All. Additionally, we have three different techniques for determining the number of clusters. The first technique, ICC Avg... The second technique, ICC Fix, uses a fixed number of clusters per class. This can be useful when a domain expert has knowledge about the classes and knows that each class is really composed of x sub-classes. ... We evaluate ICC using three different clustering algorithms: k-means++ (Arthur and Vassilvitskii 2007), Expectation Maximization Clustering, and Cascade Simple KMeans (Caliński and Harabasz 1974). ... We run 10 iterations of 3-fold cross validation for each dataset to determine which results are significant. |
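
The evaluation protocol quoted in the table (10 iterations of 3-fold cross validation) can be illustrated with a minimal sketch. This is not the authors' code, which is distributed as Weka packages. The function name `repeated_stratified_kfold`, the seed handling, and the choice to stratify folds by class are all assumptions made here for illustration; the quoted text does not say whether the folds were stratified.

```python
import random
from collections import defaultdict


def repeated_stratified_kfold(labels, n_splits=3, n_repeats=10, seed=0):
    """Yield (repeat, fold, train_idx, test_idx) tuples.

    Hypothetical helper: defaults mirror the "10 iterations of 3-fold
    cross validation" in the table. Stratification is an assumption.
    """
    rng = random.Random(seed)

    # Group instance indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    for repeat in range(n_repeats):
        folds = [[] for _ in range(n_splits)]
        for members in by_class.values():
            shuffled = members[:]
            rng.shuffle(shuffled)
            # Deal each class round-robin so every fold keeps roughly
            # the original class ratio, including the minority class.
            for pos, idx in enumerate(shuffled):
                folds[pos % n_splits].append(idx)

        for fold, test_idx in enumerate(folds):
            held_out = set(test_idx)
            train_idx = [i for i in range(len(labels)) if i not in held_out]
            yield repeat, fold, train_idx, sorted(test_idx)


if __name__ == "__main__":
    # Toy imbalanced label vector: 12 majority vs. 3 minority instances.
    toy_labels = ["neg"] * 12 + ["pos"] * 3
    splits = list(repeated_stratified_kfold(toy_labels))
    print(len(splits))  # 10 repeats x 3 folds = 30 train/test splits
```

Each of the 30 splits would then be used to train and score one classifier, and the per-split scores compared across methods to judge significance. Stratifying matters for skewed data because an unstratified split can leave a fold with no minority-class instances at all.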