Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Improving Private Random Forest Prediction Using Matrix Representation

Authors: Arisa Tajima, Joie Wu, Amir Houmansadr

AAAI 2025 | Venue PDF | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Our experimental results show significant accuracy improvements of up to 40% compared to state-of-the-art methods. We validate our methods on real-world datasets, demonstrating a significant accuracy improvement of up to 40% compared to existing approaches. |
| Researcher Affiliation | Academia | (1) University of Massachusetts Amherst; (2) Independent Researcher. EMAIL, EMAIL, EMAIL |
| Pseudocode | Yes | Algorithm 1: DP M-RF Training; Algorithm 2: DP M-RF Prediction; Algorithm 3: DP M-RF Framework |
| Open Source Code | Yes | Our code and technical appendix will be available at: https://github.com/arisa77/mrf-public.git. |
| Open Datasets | Yes | Datasets. We use six popular classification datasets from the UCI ML Repository (Kelly, Longjohn, and Nottingham 2023) with feature dimensions ranging from 4 to 128: Car, Iris, Scale, Adult, Heart, and Mushroom. |
| Dataset Splits | Yes | Unless explicitly denoted, each dataset is split into train and test subsets with a 80:20 ratio. |
| Hardware Specification | Yes | All implementations are in Python and experiments were conducted on a MacBook Air (M2 chip with 16GB RAM). |
| Software Dependencies | No | The paper mentions 'Python' as the implementation language and 'scikit-learn' for a baseline classifier, but does not provide specific version numbers for these software components. For example: 'All implementations are in Python' and 'the Extra-Trees classifier from scikit-learn'. |
| Experiment Setup | Yes | Figure 2: Test accuracy of different private prediction techniques on various datasets, varying values of privacy loss ϵ with fixed parameters: h = 4, τ = 128 for Car; h = 2, τ = 64 for Iris; h = 2, τ = 128 for Balance; h = 2, τ = 128, q = 2, d = 4 for Heart; h = 3, τ = 125, q = 5, d = 4 for Mushroom; and h = 8, τ = 100, q = 4, d = 10 for Adult. |
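
For readers who want to set up a comparable run, the reported 80:20 split and the fixed parameters from the Figure 2 caption can be written down as a minimal Python sketch. This is an illustration only, not the authors' code: the names `FIG2_PARAMS` and `split_80_20` are hypothetical, the seeded shuffle is an assumption (the quoted text does not say how rows are assigned to each subset), and the parameter symbols are copied as listed without interpreting them.

```python
import random

# Fixed parameters as listed in the Figure 2 caption.
# Keys mirror the caption's symbols (tau stands for the Greek letter τ);
# their meaning is defined in the paper, not here.
FIG2_PARAMS = {
    "Car": {"h": 4, "tau": 128},
    "Iris": {"h": 2, "tau": 64},
    "Balance": {"h": 2, "tau": 128},
    "Heart": {"h": 2, "tau": 128, "q": 2, "d": 4},
    "Mushroom": {"h": 3, "tau": 125, "q": 5, "d": 4},
    "Adult": {"h": 8, "tau": 100, "q": 4, "d": 10},
}


def split_80_20(rows, seed=0):
    """Return (train, test) with an 80:20 ratio.

    Assumption: rows are shuffled with a seeded RNG before splitting;
    the source only states the ratio.
    """
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    cut = (len(shuffled) * 4) // 5  # integer arithmetic avoids float rounding
    return shuffled[:cut], shuffled[cut:]


# Usage with 150 placeholder row indices (hypothetical data, not a real dataset).
train, test = split_80_20(range(150))
print(len(train), len(test))  # 120 30
```

Fixing the seed keeps the split repeatable across runs, which matters here because the quoted setup does not pin software versions.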