Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.

Improving Private Random Forest Prediction Using Matrix Representation

Authors: Arisa Tajima, Joie Wu, Amir Houmansadr

AAAI 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Our experimental results show significant accuracy improvements of up to 40% compared to state-of-the-art methods. We validate our methods on real-world datasets, demonstrating a significant accuracy improvement of up to 40% compared to existing approaches.
Researcher Affiliation	Academia	1University of Massachusetts Amherst 2Independent Researcher EMAIL, EMAIL, EMAIL
Pseudocode	Yes	Algorithm 1: DP M-RF Training Algorithm 2: DP M-RF Prediction Algorithm 3: DP M-RF Framework
Open Source Code	Yes	Our code and technical appendix will be available at: https://github.com/arisa77/mrf-public.git.
Open Datasets	Yes	Datasets. We use six popular classification datasets from the UCI ML Repository (Kelly, Longjohn, and Nottingham 2023) with feature dimensions ranging from 4 to 128: Car, Iris, Scale, Adult, Heart, and Mushroom.
Dataset Splits	Yes	Unless explicitly denoted, each dataset is split into train and test subsets with a 80:20 ratio.
Hardware Specification	Yes	All implementations are in Python and experiments were conducted on a Mac Book Air (M2 chip with 16GB RAM).
Software Dependencies	No	The paper mentions 'Python' as the implementation language and 'scikit-learn' for a baseline classifier, but does not provide specific version numbers for these software components. For example: 'All implementations are in Python' and 'the Extra-Trees classifier from scikit-learn'.
Experiment Setup	Yes	Figure 2: Test accuracy of different private prediction techniques on various datasets, varying values of privacy loss ϵ with fixed parameters: h = 4, τ = 128 for Car, h = 2, τ = 64 for Iris, h = 2, τ = 128 for Balance, h = 2, τ = 128, q = 2, d = 4 for Heart, h = 3, τ = 125, q = 5, d = 4 for Mushroom and h = 8, τ = 100, q = 4, d = 10 for Adult.
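
For anyone attempting a rerun, the settings quoted in the Dataset Splits and Experiment Setup rows can be collected in one place. The snippet below is only a sketch and is not taken from the authors' repository. The dictionary keys, the function name `split_80_20`, the shuffling step, and the seed are assumptions made for illustration; the quoted text gives the 80:20 ratio and the per-dataset values of h, τ, q, and d, but not how rows are shuffled or seeded.

```python
import random

# Per-dataset values transcribed from the Figure 2 caption quoted above.
# Key names are this sketch's own; q and d are listed only where the caption gives them.
FIGURE2_PARAMS = {
    "Car": {"h": 4, "tau": 128},
    "Iris": {"h": 2, "tau": 64},
    "Balance": {"h": 2, "tau": 128},
    "Heart": {"h": 2, "tau": 128, "q": 2, "d": 4},
    "Mushroom": {"h": 3, "tau": 125, "q": 5, "d": 4},
    "Adult": {"h": 8, "tau": 100, "q": 4, "d": 10},
}


def split_80_20(rows, seed=0):
    """Return (train, test) with an 80:20 ratio.

    Shuffling and the seed are assumptions; the source states only the ratio.
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * 4 // 5  # integer arithmetic avoids float rounding
    return shuffled[:cut], shuffled[cut:]


# Example with 150 placeholder row indices (not real data).
train, test = split_80_20(range(150))
print(len(train), len(test))
```

Fixing the seed keeps the split stable across runs, which matters when comparing accuracy at different values of ϵ on the same test subset.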