Simple Weak Coresets for Non-decomposable Classification Measures
Authors: Jayesh Malaviya, Anirban Dasgupta, Rachit Chhaya
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We provide experimental evidence of our results on real datasets and for various classifiers and sampling techniques. Figures 1 through 4 clearly show that uniform sampling gives superior or comparable performance to other sophisticated methods for both F1 score and MCC. (A minimal uniform-sampling evaluation sketch appears below the table.) |
| Researcher Affiliation | Academia | 1) Indian Institute of Technology, Gandhinagar; 2) DA-IICT, Gandhinagar |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper neither provides a direct link to source code for the described methodology nor states that the code has been released. |
| Open Datasets | Yes | Data Sets: The COVERTYPE (Blackard 1998) data consists of 581,012 cartographic observations of different forests with 54 features. The task is to predict the type of trees at each location (49% positive). The KDDCUP 99 (Stolfo et al. 1999) data comprises 494,021 network connections with 41 features, and the task is to detect network intrusions (20% positive). The Adult (Becker and Kohavi 1996) dataset is a widely used dataset containing information about individuals from the 1994 U.S. Census Bureau database. (A dataset-loading sketch appears below the table.) |
| Dataset Splits | No | The paper describes how the training data is used but does not give explicit train/validation/test splits (percentages or sample counts per subset), so the data partitioning cannot be reproduced exactly. |
| Hardware Specification | Yes | All experiments were run on a computer with Nvidia Tesla V100 GPU with 32 GB memory and 28 CPUs. |
| Software Dependencies | No | The paper mentions Python and scikit-learn but does not provide specific version numbers for these or any other software dependencies. |
| Experiment Setup | Yes | For MLP experiments, we considered a simple MLP classifier with two hidden layers of size 100 each and the final output layer of size two, as we are dealing with binary classification. The optimizer used for the MLP is Adam, and the activation function used is ReLU. (An MLP configuration sketch appears below the table.) |
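The Research Type row describes comparing uniform sampling against more sophisticated coreset constructions on F1 score and MCC. Below is a minimal sketch of that kind of evaluation loop; the synthetic data, logistic-regression classifier, and coreset budget are illustrative assumptions, not the authors' exact protocol.

```python
# Minimal sketch: fit on a uniformly sampled subset ("weak coreset")
# and score with F1 and MCC. Dataset, classifier, and coreset size are
# illustrative assumptions, not the paper's exact protocol.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score, matthews_corrcoef
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=50_000, n_features=54, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

rng = np.random.default_rng(0)
coreset_size = 1_000  # assumed sampling budget
idx = rng.choice(len(X_train), size=coreset_size, replace=False)

clf = LogisticRegression(max_iter=1_000).fit(X_train[idx], y_train[idx])
pred = clf.predict(X_test)
print(f"F1:  {f1_score(y_test, pred):.3f}")
print(f"MCC: {matthews_corrcoef(y_test, pred):.3f}")
```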
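All three datasets quoted in the Open Datasets row have public loaders in scikit-learn (COVERTYPE and the 10% KDD Cup 99 subset directly, Adult via OpenML). A loading sketch follows; note that the binary task definitions quoted in the table (49% and 20% positive) are the paper's label mappings, which the loaders do not apply.

```python
# Sketch: fetching public versions of the three datasets named in the
# Open Datasets row. The paper's label binarization is not reproduced
# here; only the raw multi-class data is loaded.
from sklearn.datasets import fetch_covtype, fetch_kddcup99, fetch_openml

# COVERTYPE: 581,012 cartographic observations with 54 features.
cov = fetch_covtype()
print(cov.data.shape)  # (581012, 54)

# KDD Cup 99: the 10% subset has the 494,021 connections cited above.
kdd = fetch_kddcup99(percent10=True)
print(kdd.data.shape)  # (494021, 41)

# Adult: 1994 U.S. Census data, hosted on OpenML.
adult = fetch_openml("adult", version=2, as_frame=True)
print(adult.frame.shape)
```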
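The Experiment Setup row fully specifies the MLP architecture, so it can be written down directly. Here is a minimal sketch using scikit-learn's `MLPClassifier` with the quoted settings; hyperparameters the table does not state (learning rate, iteration count) are left at library defaults, which is an assumption.

```python
# Sketch of the MLP from the Experiment Setup row: two hidden layers of
# 100 units each, ReLU activation, Adam optimizer. Unstated
# hyperparameters are left at scikit-learn defaults (an assumption).
from sklearn.neural_network import MLPClassifier

mlp = MLPClassifier(
    hidden_layer_sizes=(100, 100),  # two hidden layers of size 100
    activation="relu",
    solver="adam",
    random_state=0,
)
# Note: for binary classification, scikit-learn sizes the output layer
# automatically as a single logistic unit; the paper's two-unit output
# layer suggests a softmax head as in other frameworks, but the learned
# decision function is equivalent.
```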