Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Supervised Heterogeneous Domain Adaptation via Random Forests

Authors: Sanatan Sukhija, Narayanan C Krishnan, Gurkanwal Singh

IJCAI 2016 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We conduct extensive experiments on three diverse datasets of varying dimensions and sparsity to verify the superiority of the proposed approach over other baseline and state of the art transfer approaches.
Researcher Affiliation	Academia	1Department of Computer Science and Engineering, Indian Institute of Technology Ropar, Punjab, India EMAIL, EMAIL 2Department of Computer Science and Engineering, PEC University of Technology, Chandigarh, India EMAIL
Pseudocode	Yes	Algorithm 1 Supervised HDA via Random Forests (SHDA-RF)
Open Source Code	No	The paper does not provide concrete access to source code for the described methodology. No repository links or explicit statements about code availability are found.
Open Datasets	Yes	The CASAS dataset [Cook et al., 2013a] is a collection of smart home datasets that are widely used for investigating activity recognition algorithms. The 20 Newsgroups [Lang, 1995] text collection is a sparse dataset... The Statlog (Landsat Satellite) [Lichman, 2013] image dataset comprises of 6 classes and 36 real-valued features.
Dataset Splits	Yes	The target training set consists of approximately 7000 samples that preserve the original class distribution. 16 such random subsets are used for evaluating the performance of the different algorithms. Target training data is created by randomly selecting 10 samples per class.
Hardware Specification	No	The paper does not provide specific hardware details (e.g., GPU/CPU models, memory amounts) used for running its experiments.
Software Dependencies	No	The paper does not provide specific ancillary software details with version numbers, such as library or framework versions, needed to replicate the experiment.
Experiment Setup	Yes	The number of trees in the random forest was set to 100. The number of bagged features for learning in a tree in the forest was set to d + 5, where d is the total number of features. The parameters for the SVM model with RBF kernel were fine-tuned using grid search. Based on cross validation experiments, the length of ECOC was set to 35, beyond which the performance plateaued.
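
The Dataset Splits row quotes the paper as creating target training data "by randomly selecting 10 samples per class". The paper releases no code, so the following is only a minimal sketch of how such a per-class draw might look. The function name `sample_per_class`, the toy label list, and the use of seeds to repeat the draw are all assumptions made for illustration and are not taken from the paper.

```python
import random
from collections import defaultdict


def sample_per_class(labels, per_class=10, seed=0):
    """Return sorted indices with `per_class` randomly chosen samples per label.

    Hypothetical helper: the paper only states that 10 samples per class
    are selected at random; everything else here is an assumption.
    """
    rng = random.Random(seed)

    # Group sample indices by their class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    chosen = []
    for label in sorted(by_class):
        members = by_class[label]
        if len(members) < per_class:
            raise ValueError(f"class {label!r} has fewer than {per_class} samples")
        # Draw without replacement inside each class.
        chosen.extend(rng.sample(members, per_class))
    return sorted(chosen)


# Toy labels (assumed): 6 classes with 50 samples each.
toy_labels = [c for c in range(6) for _ in range(50)]

# Repeat the draw with different seeds to obtain several random subsets.
# Pairing 16 repetitions with the 10-per-class draw is an assumption.
subsets = [sample_per_class(toy_labels, per_class=10, seed=s) for s in range(16)]
```

Fixing the seed makes each subset repeatable, which is the kind of detail a reader would need in order to rerun the evaluation on identical splits.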