Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et alK. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026..
Provably Accurate Double-Sparse Coding
Authors: Thanh V. Nguyen, Raymond K. W. Wong, Chinmay Hegde
JMLR 2019 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, we corroborate our theory with several numerical experiments on simulated data, suggesting that our method may be useful for problem sizes encountered in practice. Keywords: Sparse coding, provable algorithms, unsupervised learning |
| Researcher Affiliation | Academia | Thanh V. Nguyen EMAIL Department of Electrical and Computer Engineering Iowa State University Ames, IA 50011, USA Raymond K. W. Wong EMAIL Department of Statistics Texas A&M University College Station, TX 77843, USA Chinmay Hegde EMAIL Department of Electrical and Computer Engineering Iowa State University Ames, IA 50011, USA |
| Pseudocode | Yes | Algorithm 1 Truncated Pairwise Reweighting Initialize L = Randomly divide p samples into two disjoint sets P1 and P2 of sizes p1 and p2 respectively While |L| < m. Pick u and v from P1 at random For every l = 1, 2, . . . , n; compute i=1 y(i), u y(i), v (y(i) l )2 Sort (be1, be2, . . . , ben) in descending order If r r s.t be(r ) O(k/mr) and be(r +1)/be(r ) < O (r/ log2 n) Let b R be set of the r largest entries of be p2 Pp2 i=1 y(i), u y(i), v y(i) b R (y(i) b R )T δ1, δ2 top singular values of c Mu,v z b R top singular vector of c Mu,v If δ1 Ω(k/m) and δ2 < O (k/m log n) If dist( z, l) > 1/ log n for any l L Update L = L {z} Return A0 = (L1, . . . , Lm) |
| Open Source Code | Yes | Matlab implementation of our algorithms is available online3. https://github.com/thanh-isu/double-sparse-coding |
| Open Datasets | No | We generate a synthetic training dataset according to the model described in Section 2. |
| Dataset Splits | No | The paper states: "For each Monte Carlo trial, we uniformly draw p samples, feed these samples to the four different algorithms, and observe their ability to reconstruct A ." This indicates data is generated for each trial rather than split from a predefined dataset, so no specific dataset splits are provided. |
| Hardware Specification | No | The paper does not mention any specific hardware (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions "Matlab implementation" but does not specify a version number. It also refers to "Trainlets s implementation" for a third-party tool, but this is not for the authors' own methodology and doesn't provide version numbers for their work's dependencies. |
| Experiment Setup | Yes | For all the approaches except Trainlets, we use T = 2000 iterations for the initialization procedure, and set the number of steps in the descent stage to 25. Since Trainlets does not have a specified initialization procedure, we initialize it with a random Gaussian matrix upon which column-wise sparse thresholding is then performed. The learning step of Trainlets2 is executed for 50 iterations, which tolerates its initialization deficiency. For each Monte Carlo trial, we uniformly draw p samples, feed these samples to the four different algorithms, and observe their ability to reconstruct A . The base dictionary Φ is the identity matrix of size n = 64, and the square synthesis matrix A has a special block structure with 32 blocks. Each block is of size 2 2 and of form [1 1; 1 1] (i.e., the column sparsity of A is r = 2). The support of x is drawn uniformly over all 6-dimensional subsets of [m], and the nonzero coefficients are randomly set to 1 with equal probability. In our simulations with noise, we add Gaussian noise ε with entrywise variance σ2 ε = 0.01 to each of those above samples. |