Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Recovery and Generalization in Over-Realized Dictionary Learning
Authors: Jeremias Sulam, Chong You, Zhihui Zhu
JMLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | This work characterizes the surprising phenomenon that dictionary recovery can be facilitated by searching over the space of larger over-realized models. This observation is general and independent of the specific dictionary learning algorithm used. We thoroughly demonstrate this observation in practice and provide an analysis of this phenomenon by tying recovery measures to generalization bounds. In particular, we show that model recovery can be upper-bounded by the empirical risk, a model-dependent quantity, and the generalization gap, reflecting our empirical findings. We further show that an efficient and provably correct distillation approach can be employed to recover the correct atoms from the over-realized model. As a result, our meta-algorithm provides dictionary estimates with consistently better recovery of the ground-truth model. ... As a motivating example, we construct the following experimental setting. Data is sampled as described in the previous section from a ground-truth dictionary (with normalized Gaussian atoms) of size 50 × 70, from representations with cardinality k = 3. We construct 300 such samples for training, leaving 1000 to estimate the population statistics. As a learning algorithm, we employ ODL (Mairal et al., 2010) for 2000 iterations, which are more than sufficient for convergence. We employ OMP for the sparse coding step. In Figure 1a we depict the risk, or error, on both training and test sets, as a function of the number of atoms in the estimated dictionary D̂, from 70 (the size of the ground-truth model) to 500. We repeat the experiment 20 times, and present the mean together with the 25th and 75th percentiles. Interestingly, both train and test errors, shown in Figure 1a, improve with increasing dictionary size p′ > p within some range. |
| Researcher Affiliation | Academia | Jeremias Sulam EMAIL Department of Biomedical Engineering & Mathematical Institute for Data Science Johns Hopkins University Baltimore, MD 21205, USA Chong You EMAIL Department of Electrical Engineering & Computer Sciences University of California Berkeley, CA 94720-1776, USA Zhihui Zhu EMAIL Department of Electrical & Computer Engineering University of Denver Denver, CO 80210, USA |
| Pseudocode | Yes | Algorithm 1: Meta-algorithm for over-realized dictionary learning. Data: set of n training samples X ∈ ℝ^(d×n); dictionary size p; sparse coding parameter (cardinality or penalty) λ. Initialization: choose p′ > p and a dictionary learning method Learn_D(p, λ). Dictionary Learning: D̂ ← Learn_D(p′, λ). Distillation: D̃ ← {top p most-used atoms in D̂}. Result: estimated dictionary D̃. |
| Open Source Code | No | Available at spams-devel.gforge.inria.fr/. Note that ODL can accommodate different formulations and algorithms for the sparse coding step (and not just an ℓ1 minimization), which will enable us to explore different experimental settings. |
| Open Datasets | No | Data is sampled as described in the previous section from a ground-truth dictionary (with normalized Gaussian atoms) of size 50 × 70, from representations with cardinality k = 3. We construct 300 such samples for training, leaving 1000 to estimate the population statistics. |
| Dataset Splits | Yes | Data is sampled as described in the previous section from a ground-truth dictionary (with normalized Gaussian atoms) of size 50 × 70, from representations with cardinality k = 3. We construct 300 such samples for training, leaving 1000 to estimate the population statistics. |
| Hardware Specification | No | The paper does not provide specific hardware details used for running its experiments. |
| Software Dependencies | No | The paper mentions 'ODL (Mairal et al., 2010)', 'OMP', 'K-SVD', and 'Lasso' as algorithms or methods used, and links to 'spams-devel.gforge.inria.fr/' for ODL. However, it does not provide specific version numbers for any software, libraries, or programming languages used in the implementation of their work. |
| Experiment Setup | Yes | Data is sampled as described in the previous section from a ground-truth dictionary (with normalized Gaussian atoms) of size 50 × 70, from representations with cardinality k = 3. We construct 300 such samples for training, leaving 1000 to estimate the population statistics. As a learning algorithm, we employ ODL (Mairal et al., 2010) for 2000 iterations, which are more than sufficient for convergence. We employ OMP for the sparse coding step. In Figure 1a we depict the risk, or error, on both training and test sets, as a function of the number of atoms in the estimated dictionary D̂, from 70 (the size of the ground-truth model) to 500. We repeat the experiment 20 times, and present the mean together with the 25th and 75th percentiles. ... As one can see, the benefits of searching over larger models deteriorate smoothly with increasing noise, both for the general over-realized model as well as for our practical distillation approach. |
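The synthetic setup quoted above (a ground-truth 50 × 70 dictionary of normalized Gaussian atoms, k = 3 sparse representations, 300 training and 1000 test samples) can be sketched as follows. This is a minimal illustration assuming the paper's generative model x = Da with a k-sparse Gaussian coefficient vector a; the `sample` helper and the random seed are our own illustrative choices, not the authors' code.

```python
import numpy as np

rng = np.random.default_rng(0)

d, p, k = 50, 70, 3          # signal dimension, number of atoms, cardinality
n_train, n_test = 300, 1000  # sample counts quoted in the paper

# Ground-truth dictionary: Gaussian atoms normalized to unit L2 norm.
D = rng.standard_normal((d, p))
D /= np.linalg.norm(D, axis=0, keepdims=True)

def sample(n):
    """Draw n signals x = D a with k-sparse Gaussian coefficients a."""
    X = np.empty((d, n))
    for i in range(n):
        support = rng.choice(p, size=k, replace=False)  # random support
        a = np.zeros(p)
        a[support] = rng.standard_normal(k)             # Gaussian amplitudes
        X[:, i] = D @ a
    return X

X_train, X_test = sample(n_train), sample(n_test)
```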
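Algorithm 1 (learn an over-realized dictionary of p′ > p atoms, then distill the top p most-used atoms) can likewise be sketched. The paper's experiments used the SPAMS implementation of ODL; here we substitute scikit-learn's `MiniBatchDictionaryLearning`, which also follows Mairal et al. (2010), with OMP as the sparse coding step. The function name, toy data, and hyperparameters are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning

def learn_and_distill(X, p, p_over, k, seed=0):
    """X: (n_samples, d) data. Learn p_over atoms, return the p most-used."""
    dl = MiniBatchDictionaryLearning(
        n_components=p_over,            # over-realized size p' > p
        transform_algorithm='omp',      # OMP for the sparse coding step
        transform_n_nonzero_coefs=k,    # cardinality constraint
        random_state=seed,
    )
    codes = dl.fit(X).transform(X)      # sparse codes, shape (n_samples, p_over)
    usage = np.count_nonzero(codes, axis=0)   # how often each atom is selected
    top = np.argsort(usage)[::-1][:p]   # indices of the top-p used atoms
    return dl.components_[top]          # distilled dictionary, shape (p, d)

# Toy usage on random data (illustrative only):
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 20))
D_tilde = learn_and_distill(X, p=10, p_over=30, k=3)
print(D_tilde.shape)  # (10, 20)
```

Counting nonzero code entries over the training set is the natural reading of "top p-used atoms" in the quoted pseudocode; on real recovery experiments one would then match the distilled atoms to the ground-truth dictionary up to permutation and sign.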