Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
A Novel Data Representation for Effective Learning in Class Imbalanced Scenarios
Authors: Sri Harsha Dumpala, Rupayan Chakraborty, Sunil Kumar Kopparapu
IJCAI 2018 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on several benchmark datasets clearly indicate the usefulness of the proposed approach over the existing state-of-the-art techniques. |
| Researcher Affiliation | Industry | Sri Harsha Dumpala, Rupayan Chakraborty and Sunil Kumar Kopparapu TCS Research and Innovation Mumbai, India EMAIL |
| Pseudocode | No | The paper describes its methods in narrative text and figures, but does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | No | Proposed s2sL along with MLP and CSMLP techniques are implemented using Keras deep learning toolkit [KER, 2016]. There is no statement about the authors releasing their own code. |
| Open Datasets | Yes | All datasets used in this work are obtained from KEEL dataset repository [Fernandez et al., 2008]. |
| Dataset Splits | Yes | For each dataset, we use 5-fold (the folds as provided in the KEEL dataset repository are directly used) cross-validation approach to compare the performance of all the methods considered for analysis. ... Hence, at any time 80% of the data is used for training (75% as train set and 5% as validation set) and remaining 20% of the data is used for testing. The validation set is used for selecting network architecture and for hyper-parameter tuning. |
| Hardware Specification | Yes | Further, the average training time (in seconds) for convergence (using i5-3210M 3.1GHz CPU with 4-GB RAM) on Yeast6 for different techniques are: 98.7 (s2sL), 38.5 (MLP), 43.7 (CS-MLP), 213.8 (CSM), 95.3 (GSVM), 146.4 (EUSB). |
| Software Dependencies | No | Proposed s2sL along with MLP and CSMLP techniques are implemented using Keras deep learning toolkit [KER, 2016]. This mentions Keras but does not provide a specific version number for it. |
| Experiment Setup | Yes | For training s2s-MLP, we use Adam algorithm with an initial learning rate of 0.001. Binary cross-entropy is used as the cost function. The batch size and other hyper-parameters are selected considering the performance on the validation set. The number of units in the hidden layer is selected empirically by varying the hidden units from 2 to 4d (twice the length of the input layer) |
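
The split quoted under Dataset Splits (per round, 75% train, 5% validation, 20% test) can be illustrated with a minimal sketch. This is not the authors' code: the function name `split_fold`, the toy folds, and the random choice of validation samples are assumptions made for illustration. The quoted text does not say how the validation samples were drawn from the four training folds.

```python
import random


def split_fold(folds, test_idx, val_fraction_of_total=0.05, seed=0):
    """Return (train, val, test) index lists for one cross-validation round.

    `folds` is a list of pre-defined folds (lists of sample indices), standing
    in for the folds shipped with the KEEL repository. One fold is held out
    for testing; the validation set is carved out of the remaining folds.
    Assumption: validation samples are picked uniformly at random.
    """
    test = list(folds[test_idx])
    rest = [i for k, fold in enumerate(folds) if k != test_idx for i in fold]
    total = len(test) + len(rest)

    # 5% of the *whole* dataset, taken from the 80% that is not the test fold.
    n_val = round(val_fraction_of_total * total)

    rng = random.Random(seed)
    rng.shuffle(rest)
    val, train = rest[:n_val], rest[n_val:]
    return train, val, test


# Hypothetical toy data: 100 sample indices in 5 pre-defined folds of 20 each.
folds = [list(range(k * 20, (k + 1) * 20)) for k in range(5)]

for test_idx in range(len(folds)):
    train, val, test = split_fold(folds, test_idx)
    print(test_idx, len(train), len(val), len(test))
```

With 100 samples, each round yields 75 training, 5 validation, and 20 test indices. The validation set is therefore one sixteenth of the non-test data, not 5% of it.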