Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Improving Private Random Forest Prediction Using Matrix Representation
Authors: Arisa Tajima, Joie Wu, Amir Houmansadr
AAAI 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experimental results show significant accuracy improvements of up to 40% compared to state-of-the-art methods. We validate our methods on real-world datasets, demonstrating a significant accuracy improvement of up to 40% compared to existing approaches. |
| Researcher Affiliation | Academia | 1University of Massachusetts Amherst 2Independent Researcher EMAIL, EMAIL, EMAIL |
| Pseudocode | Yes | Algorithm 1: DP M-RF Training; Algorithm 2: DP M-RF Prediction; Algorithm 3: DP M-RF Framework |
| Open Source Code | Yes | Our code and technical appendix will be available at: https://github.com/arisa77/mrf-public.git. |
| Open Datasets | Yes | Datasets. We use six popular classification datasets from the UCI ML Repository (Kelly, Longjohn, and Nottingham 2023) with feature dimensions ranging from 4 to 128: Car, Iris, Scale, Adult, Heart, and Mushroom. |
| Dataset Splits | Yes | Unless explicitly denoted, each dataset is split into train and test subsets with a 80:20 ratio. |
| Hardware Specification | Yes | All implementations are in Python and experiments were conducted on a MacBook Air (M2 chip with 16GB RAM). |
| Software Dependencies | No | The paper mentions 'Python' as the implementation language and 'scikit-learn' for a baseline classifier, but does not provide specific version numbers for these software components. For example: 'All implementations are in Python' and 'the Extra-Trees classifier from scikit-learn'. |
| Experiment Setup | Yes | Figure 2: Test accuracy of different private prediction techniques on various datasets, varying values of privacy loss ϵ with fixed parameters: h = 4, τ = 128 for Car, h = 2, τ = 64 for Iris, h = 2, τ = 128 for Balance, h = 2, τ = 128, q = 2, d = 4 for Heart, h = 3, τ = 125, q = 5, d = 4 for Mushroom and h = 8, τ = 100, q = 4, d = 10 for Adult. |
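
The 80:20 train/test split quoted in the Dataset Splits row can be sketched as follows. This is a minimal illustration using only the Python standard library. The helper name `split_train_test`, the seed, and the placeholder rows are assumptions made for the example, not taken from the paper or its repository, which may shuffle or stratify differently.

```python
import random


def split_train_test(rows, train_percent=80, seed=0):
    """Shuffle rows and split them into train and test subsets.

    Hypothetical helper for illustration only; not the authors' code.
    """
    shuffled = list(rows)
    # A seeded generator keeps the split repeatable across runs.
    random.Random(seed).shuffle(shuffled)
    # Integer arithmetic avoids floating-point rounding at the cut point.
    cut = len(shuffled) * train_percent // 100
    return shuffled[:cut], shuffled[cut:]


# Placeholder data: 150 row indices, matching the size of the Iris dataset.
rows = list(range(150))
train, test = split_train_test(rows)
print(len(train), len(test))
```

Fixing the seed is one way to make such a split repeatable; the record above does not state whether the paper fixes a seed.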