Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
SLOE: A Faster Method for Statistical Inference in High-Dimensional Logistic Regression
Authors: Steve Yadlowsky, Taedong Yun, Cory Y McLean, Alexander D'Amour
NeurIPS 2021 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we evaluate $\hat{\eta}_{\mathrm{SLOE}}$ in simulated data, where ground truth parameter values are known. |
| Researcher Affiliation | Industry | Steve Yadlowsky EMAIL, Taedong Yun EMAIL, Cory McLean EMAIL, Alexander D'Amour EMAIL; Google Research, Brain Team; Google Health |
| Pseudocode | No | The paper describes the mathematical derivation and approximation for SLOE but does not include a formal pseudocode or algorithm block. |
| Open Source Code | Yes | Code in Supplement |
| Open Datasets | Yes | The Heart Disease dataset (downloadable from the UCI Machine Learning Repository)... 136 training examples and 20 predictors (κ = 0.15). |
| Dataset Splits | No | The paper does not explicitly specify a separate 'validation' dataset split for hyperparameter tuning or model selection. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory, or cloud instance types) used for running experiments. |
| Software Dependencies | Yes | We implemented this estimator in Python, using scikit-learn to perform the MLE, and numpy / scipy [Harris et al., 2020, Virtanen et al., 2020] for the high dimensional adjustment and inference. |
| Experiment Setup | Yes | In our simulations, we use a data generating process parameterized by the sample size n, the aspect ratio κ, and signal strength γ². We show the results of these coverage experiments, with n = 4000. |
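
The Experiment Setup and Software Dependencies rows describe a simulation parameterized by n, κ, and γ², with the MLE fit in scikit-learn. The following is a minimal sketch of what such a setup might look like, not the authors' code. It assumes p = κn standard Gaussian predictors, a coefficient vector scaled so that the linear predictor has variance γ², no intercept, and a very large `C` as a stand-in for an unpenalized fit. The function names `simulate_logistic` and `fit_mle` are hypothetical.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression


def simulate_logistic(n, kappa, gamma2, seed=0):
    """Draw (X, y, beta) from an assumed logistic data generating process.

    Assumptions (not taken from the paper): p = round(kappa * n) predictors,
    X has i.i.d. N(0, 1) entries, and ||beta||^2 = gamma2 so that
    Var(x @ beta) = gamma2.
    """
    rng = np.random.default_rng(seed)
    p = int(round(kappa * n))
    X = rng.standard_normal((n, p))
    beta = rng.standard_normal(p)
    beta *= np.sqrt(gamma2) / np.linalg.norm(beta)  # rescale to the target signal strength
    prob = 1.0 / (1.0 + np.exp(-(X @ beta)))        # logistic link
    y = rng.binomial(1, prob)
    return X, y, beta


def fit_mle(X, y):
    """Approximate the unpenalized MLE by making the L2 penalty negligible."""
    model = LogisticRegression(C=1e12, fit_intercept=False, max_iter=1000)
    model.fit(X, y)
    return model.coef_.ravel()


# Small illustrative run; the table reports n = 4000 for the coverage experiments.
X, y, beta = simulate_logistic(n=400, kappa=0.1, gamma2=1.0)
beta_hat = fit_mle(X, y)
```

A large `C` is used here instead of a version-specific "no penalty" option so the sketch runs across scikit-learn releases; it only approximates the exact MLE.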