Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Weighted L1 and L0 Regularization Using Proximal Operator Splitting Methods
Authors: Zewude A. Berkessa, Patrik Waldmann
TMLR 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Moreover, we evaluate the effectiveness of our model on both simulated and real high-dimensional genomic datasets by comparing with adaptive versions of the least absolute shrinkage and selection operator (LASSO), elastic net (EN), smoothly clipped absolute deviation (SCAD) and minimax concave penalty (MCP). The results show that WL1L0 outperforms the LASSO, EN, SCAD and MCP by consistently achieving the lowest mean squared error (MSE) across all datasets, indicating its superior ability to handle large high-dimensional data. |
| Researcher Affiliation | Academia | Zewude A. Berkessa, Research Unit of Mathematical Sciences, University of Oulu; Patrik Waldmann, Research Unit of Mathematical Sciences, University of Oulu |
| Pseudocode | Yes | Hence, for WL1L0-ADMM, the updates are made in six steps, alternating between the two primal variables u and v, with corresponding dual variables m and w. The steps are: c^{(k+1)} := prox_{T_v(u)γ}(u^{(k)} − m^{(k)}); u^{(k+1)} := prox_{gγ}(c^{(k+1)} + m^{(k)}); m^{(k+1)} := m^{(k)} + c^{(k+1)} − u^{(k+1)}; d^{(k+1)} := prox_{T_u(v)δ}(v^{(k)} − w^{(k)}); v^{(k+1)} := prox_{hδ}(d^{(k+1)} + w^{(k)}); w^{(k+1)} := w^{(k)} + d^{(k+1)} − v^{(k+1)}. |
| Open Source Code | Yes | Julia code for the WL1L0-ADMM and WL1L0-SCPRSM is available at https://github.com/ZewAB/WL1L0-ADMM-and-SCPRSM. |
| Open Datasets | Yes | Simulated QTLMAS 2010 Dataset (Szydlowski & Paczyńska, 2011): This dataset comprises 3226 individuals... Real Pig Dataset (Cleveland et al., 2012): This dataset contains genomic SNP data from 3534 individuals... Real Mice Dataset (Pérez & de Los Campos, 2014): This dataset contains data from 1814 individuals... |
| Dataset Splits | Yes | Generations 1 to 4 (individuals 1 to 2326) were used for training, and generation 5 (individuals 2327 to 3226) served as test data. [...] For the Pig dataset, we employed 5-fold cross-validation with random allocations into training and test data to obtain the minimum test MSE on the test data set, with the results averaged over the folds. [...] Similar to the Pig dataset, we employed 5-fold cross-validation also for this data. |
| Hardware Specification | Yes | All analyses were executed on a Linux computing platform equipped with an AMD EPYC 7302P 16-Core Processor and 32GB of system memory. |
| Software Dependencies | Yes | The WL1L0-ADMM, WL1L0-SCPRSM, EN-ADMM, EN-SCPRSM, LASSO-ADMM and LASSO-SCPRSM methods were implemented in Julia 1.10.1 (Bezanson et al., 2017) using the Proximal Operators package (Antonello et al., 2018). For all methods, the BO was performed with the Bayesian Optimization package using an Elastic GPE model and the squared exponential automatic relevance determination (SEArd) kernel (Fairbrother et al., 2018). |
| Experiment Setup | Yes | The initial values of b̂, ĉ and d̂ were set to the marginal covariances between y and X, multiplied by 0.0001. By conducting preliminary runs for each set of hyperparameters using BO, we identified the optimal range of parameters. BO with the MI acquisition function was executed for hyperparameter tuning of all methods. The test MSE was monitored during the BO process to ensure convergence, which was indicated by no further decrease in MSE. [...] The iterations are terminated when convergence is reached according to ‖(c^{(k)} + d^{(k)}) − (u^{(k)} + v^{(k)})‖ ≤ β(1 + ‖m^{(k)}‖ + ‖w^{(k)}‖), for a tolerance parameter β which was set to 10⁻⁵. |
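The weighted L1 and L0 penalties named in the paper's title both admit closed-form proximal operators: elementwise soft-thresholding for the weighted L1 norm and hard-thresholding for the L0 pseudo-norm. These are standard results, not quotes from the paper; the sketch below is in NumPy rather than the paper's Julia, and the function names are ours.

```python
import numpy as np

def prox_weighted_l1(v, lam, w):
    # Proximal operator of lam * sum_i w_i * |v_i|:
    # elementwise soft-thresholding with per-coordinate threshold lam * w_i.
    return np.sign(v) * np.maximum(np.abs(v) - lam * w, 0.0)

def prox_l0(v, lam):
    # Proximal operator of lam * ||v||_0:
    # hard-thresholding, keeping entries with |v_i| > sqrt(2 * lam).
    return np.where(np.abs(v) > np.sqrt(2.0 * lam), v, 0.0)
```

Soft-thresholding shrinks every surviving coefficient by the threshold, while hard-thresholding keeps survivors at full magnitude; combining the two penalties is what distinguishes WL1L0 from a plain weighted LASSO.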
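The six-step alternating prox updates and the relative stopping rule extracted above follow the standard proximal ADMM template. As a hedged illustration of that template (not the paper's WL1L0 objective or its Julia implementation), here is a minimal NumPy sketch of ADMM for a plain LASSO problem, (1/2)‖Ax − b‖² + λ‖x‖₁; the penalty parameter rho and all names are illustrative choices of ours, and the stopping rule mirrors the β-relative criterion quoted in the Experiment Setup row.

```python
import numpy as np

def soft_threshold(v, t):
    # Proximal operator of t * ||.||_1.
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def lasso_admm(A, b, lam, rho=1.0, beta=1e-5, max_iter=1000):
    # ADMM for (1/2)||Ax - b||^2 + lam * ||x||_1,
    # alternating a primal x-update, a prox z-update, and a dual u-update.
    n = A.shape[1]
    x = np.zeros(n)
    z = np.zeros(n)
    u = np.zeros(n)          # scaled dual variable
    AtA = A.T @ A
    Atb = A.T @ b
    M = AtA + rho * np.eye(n)  # fixed linear system for the x-update
    for _ in range(max_iter):
        x = np.linalg.solve(M, Atb + rho * (z - u))   # smooth-term update
        z = soft_threshold(x + u, lam / rho)          # prox (L1) update
        u = u + x - z                                 # dual ascent step
        # Relative stopping rule in the spirit of the quoted criterion:
        if np.linalg.norm(x - z) <= beta * (1.0 + np.linalg.norm(u)):
            break
    return z
```

With A = I the problem decouples and the minimizer is soft_threshold(b, λ), which gives a quick sanity check that the loop converges to the right fixed point.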