Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Graphical Lasso and Thresholding: Equivalence and Closed-form Solutions

Authors: Salar Fattahi, Somayeh Sojoudi

JMLR 2019 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	The developed results are demonstrated on synthetic data, functional MRI data, traffic flows for transportation networks, and massive randomly generated data sets.
Researcher Affiliation	Academia	Salar Fattahi, EMAIL Department of Industrial Engineering and Operations Research University of California, Berkeley Somayeh Sojoudi, EMAIL Departments of Electrical Engineering and Computer Sciences and Mechanical Engineering University of California, Berkeley
Pseudocode	Yes	Algorithm 1: Warm-start algorithm
Open Source Code	No	We implemented the elementary estimator and the proposed closed-form solution in MATLAB using its sparse package.
Open Datasets	Yes	Consider the problem of estimating the brain functional connectivity network based on a set of resting state functional MRI (fMRI) data collected from 20 individual subjects (Vértes et al., 2012). The data is collected from the Caltrans Performance Measurement System (PeMS) database, which consists of traffic information of freeways on a statewide scale across California (Pe M 2017).
Dataset Splits	No	The paper mentions collecting 134 samples for fMRI data, combining datasets of 20 subjects, and constructing a 1049x1049 sample covariance matrix from 2016 data samples (288 samples for each day of the week) for traffic data. For randomly generated data, it states 'n = d/2 number of i.i.d. samples are drawn'. However, no specific training/test/validation splits or percentages are provided for any of these datasets.
Hardware Specification	Yes	We show that the proposed method can obtain an accurate approximation of the GL for instances with the sizes as large as 80,000 × 80,000 (more than 3.2 billion variables) in less than 30 minutes on a standard laptop computer running MATLAB, while other state-of-the-art methods do not converge within 4 hours.
Software Dependencies	Yes	We use the source codes for latest versions of QUIC and GLASSO in our simulations. In particular, we use the QUIC 1.1 (available in http://bigdata.ices.utexas.edu/software/1035/) which is implemented in C++ with MATLAB interface. The GLASSO is downloaded from http://statweb.stanford.edu/~tibs/glasso/ and is implemented in FORTRAN with MATLAB interface. We implemented the elementary estimator and the proposed closed-form solution in MATLAB using its sparse package.
Experiment Setup	Yes	One can choose the value of λ to be greater than σ_d to ensure that the graph supp(Σres) is acyclic. In particular, if we pick λ in the interval (σ_d, σ_{d−1}), the graph supp(Σres) becomes a spanning tree. Select λ as 0.85 − ϵ for a sufficiently small number ϵ and consider Condition (2-ii) in Theorem 19.
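
The Experiment Setup entry ties the choice of λ to whether the graph supp(Σres) is acyclic. The sketch below shows one way such a check could look in Python. It assumes that Σres is the sample covariance with its off-diagonal entries soft-thresholded at λ, which the excerpt above does not spell out. The function names `soft_threshold_offdiag` and `support_is_acyclic` are hypothetical and are not taken from the paper's MATLAB implementation.

```python
import numpy as np


def soft_threshold_offdiag(sigma, lam):
    """Soft-threshold the off-diagonal entries of sigma at level lam.

    Assumption: this plays the role of the residue matrix Σres; the
    diagonal is left untouched.
    """
    res = np.sign(sigma) * np.maximum(np.abs(sigma) - lam, 0.0)
    np.fill_diagonal(res, np.diag(sigma))
    return res


def support_is_acyclic(mat):
    """Return True if the off-diagonal support graph of mat has no cycle.

    Uses a union-find structure: an edge whose endpoints are already
    connected closes a cycle.
    """
    d = mat.shape[0]
    parent = list(range(d))

    def find(i):
        # Follow parent pointers to the root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Each undirected edge is visited once via the strict upper triangle.
    rows, cols = np.nonzero(np.triu(mat, k=1))
    for i, j in zip(rows, cols):
        ri, rj = find(int(i)), find(int(j))
        if ri == rj:
            return False
        parent[ri] = rj
    return True


# Toy 3 x 3 covariance matrix (illustrative values, not from the paper).
sigma = np.array([[1.0, 0.9, 0.5],
                  [0.9, 1.0, 0.2],
                  [0.5, 0.2, 1.0]])

sparse_res = soft_threshold_offdiag(sigma, lam=0.3)  # drops the 0.2 entry
dense_res = soft_threshold_offdiag(sigma, lam=0.1)   # keeps all three edges
```

With the toy matrix, the larger threshold removes the weakest edge and leaves a path on three nodes, while the smaller threshold keeps the full triangle, so only the first support graph is acyclic.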