Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Parameter-free HE-friendly Logistic Regression

Authors: Junyoung Byun, Woojin Lee, Jaewook Lee

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on various real-world data show that our framework achieves better classification results while reducing latency by 68%, compared to the previous models. In this section, we evaluate our method using various real-world datasets. Through experiments, we argue that our method achieves better classification results compared to the existing methods with a shorter computation time.
Researcher Affiliation | Academia | Junyoung Byun Seoul National University Seoul, Korea EMAIL Woojin Lee Dongguk University-Seoul, Seoul, Korea EMAIL Jaewook Lee Seoul National University Seoul, Korea EMAIL
Pseudocode | Yes | Algorithm 1 Training Ridge Regression with Encrypted Private Variable
Open Source Code | No | The paper does not include an explicit statement or a link to open-source code for the described methodology.
Open Datasets | Yes | We used five widely used classification datasets from the UCI data repository: The adult income dataset (Adult), bank marketing dataset (Bank), Wisconsin Breast Cancer dataset (Cancer), Pima Indians Diabetes dataset (Diabetes), and Australian Credit Approval (Credit) dataset [11].
Dataset Splits | No | For each dataset, we randomly sampled 20% as test samples, and 20% of the other 80% were treated as plaintext data, which were used for the training of step 1. We encrypted the private variables of the remaining 60% and used them for steps 2 and 3. (Does not mention a distinct validation split).
Hardware Specification | Yes | All the experiments were performed on a machine equipped with 40 threads of an Intel Xeon E-2660 v3 @2.60GHz CPU processor.
Software Dependencies | Yes | We implemented step 1 of our framework with Python 3.6.3, using the LR module in the scikit-learn library. Other steps were implemented with C++, using HEAAN v1.1 [7] for HE.
Experiment Setup | Yes | For CKKS parameters, we used N = 2^16, q_L = 2^1200, and P = 2^40. The sigmoid function approximation degree for LRHE was set to 3 because increasing the degree results in a larger multiplicative depth and less possible number of gradient descents with LHE. In addition, we observed that increasing the degree up to 7 did not significantly affect the performance of the model. The learning rate for LRHE was chosen in {0.001, 0.0001, 0.00001} to achieve the best classification performance.
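The Dataset Splits row describes a three-way partition: a test set, a plaintext set for step 1, and an encrypted set for steps 2 and 3. The sketch below is one possible reading of that description, not the authors' code. It takes "20% of the other 80%" literally, which yields 16% plaintext and 64% encrypted rather than the quoted "remaining 60%"; the function name `three_way_split`, its parameters, and the fixed seed are hypothetical.

```python
import random


def three_way_split(n_samples, test_frac=0.2, plain_frac_of_rest=0.2, seed=0):
    """Partition sample indices into (test, plaintext, encrypted) lists.

    Assumption: the plaintext fraction is applied to the samples left
    after the test set is removed, following the literal wording
    "20% of the other 80%".
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # random sampling without replacement

    n_test = round(n_samples * test_frac)
    n_plain = round((n_samples - n_test) * plain_frac_of_rest)

    test = indices[:n_test]
    plaintext = indices[n_test:n_test + n_plain]   # used in step 1
    encrypted = indices[n_test + n_plain:]         # used in steps 2 and 3
    return test, plaintext, encrypted


if __name__ == "__main__":
    test, plaintext, encrypted = three_way_split(1000)
    print(len(test), len(plaintext), len(encrypted))
```

Setting `plain_frac_of_rest=0.25` would instead make the plaintext set 20% of the whole dataset and leave exactly 60% for encryption, which matches the other possible reading of the row.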
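The Experiment Setup row notes that a higher sigmoid approximation degree costs more multiplicative depth and so leaves room for fewer gradient descent steps under leveled HE. The following back-of-the-envelope sketch illustrates that trade-off. It is not taken from the paper: `scale_bits` (modulus bits consumed per rescaling) and `extra_depth` (levels used per iteration outside the sigmoid) are hypothetical values chosen only for illustration, and the depth formula is a lower bound that ignores coefficient multiplications.

```python
def slot_count(ring_degree: int) -> int:
    """A CKKS ciphertext with ring degree N packs up to N / 2 values."""
    return ring_degree // 2


def min_mult_depth(degree: int) -> int:
    """Lower bound on the depth needed to reach x**degree: ceil(log2(degree))."""
    return (degree - 1).bit_length()


def max_iterations(log_q: int, scale_bits: int, sigmoid_degree: int,
                   extra_depth: int) -> int:
    """Rough count of gradient descent steps that fit in the modulus budget.

    Assumes each multiplication level consumes about `scale_bits` bits of
    the log_q-bit ciphertext modulus (hypothetical simplification).
    """
    levels = log_q // scale_bits
    depth_per_iteration = min_mult_depth(sigmoid_degree) + extra_depth
    return levels // depth_per_iteration


if __name__ == "__main__":
    print("slots:", slot_count(2 ** 16))
    for degree in (3, 5, 7):
        steps = max_iterations(log_q=1200, scale_bits=30,
                               sigmoid_degree=degree, extra_depth=2)
        print("degree", degree, "depth", min_mult_depth(degree), "steps", steps)
```

Under these assumed values, moving from degree 3 to degree 7 adds one level per iteration, which is the kind of cost the row refers to; the actual numbers depend on the scheme parameters and on how each iteration is implemented.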