Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Solving Sparse & High-Dimensional-Output Regression via Compression
Authors: Renyuan Li, Zhehui Chen, Guanyi Wang
NeurIPS 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirically, numerical results further validate the theoretical findings, showcasing the efficiency and accuracy of the proposed framework. |
| Researcher Affiliation | Collaboration | Renyuan Li, Department of Industrial Systems Engineering & Management, National University of Singapore; Zhehui Chen, Google; Guanyi Wang, Department of Industrial Systems Engineering & Management, National University of Singapore |
| Pseudocode | Yes | Algorithm 1 Projected Gradient Descent (for Second Stage) (...) Algorithm 2 Implemented Projected Gradient Descent (for Second Stage) |
| Open Source Code | Yes | The implemented code could be found on Github https://github.com/from-ryan/Solving_SHORE_via_compression. |
| Open Datasets | Yes | We select two benchmark datasets in multi-label classification, Wiki10-31K and EURLex-4K[5] due to their sparsity property. |
| Dataset Splits | No | The paper mentions splitting the synthetic data into a training set and a testing set ('training set $S_{\text{tra}}$ with 80% and a testing set $S_{\text{test}}$ with rest 20%'), but does not explicitly mention a separate validation set for hyperparameter tuning or model selection. |
| Hardware Specification | Yes | All experiments are conducted in Dell workstation Precision 7920 with a 3GHz 48Cores Intel Xeon CPU and 128GB 2934MHz DDR4 Memory. |
| Software Dependencies | Yes | The proposed method and other methods are solved using PyTorch version 2.3.0 and scikit-learn version 1.4.2 in Python 3.12.3. |
| Experiment Setup | Yes | Parameter setting. For synthetic data, we set input dimension $d = 10^4$, output dimension $K = 2 \times 10^4$, and sparsity-level $s = 3$. We generate in total $n = 3 \times 10^4$ i.i.d. samples (...) We select the number of rows for compressed matrix $\Phi$ by $m \in \{100, 300, 500, 700, 1000, 2000\}$. (...) For evaluating the proposed prediction method, Algorithm 2, we pick a fixed stepsize $\eta = 0.9$, $F = \mathbb{R}^K_+$, and set the maximum iteration number as $T = 60$, and run prediction methods over the set $S_{\text{test}}$. |
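
To make the quoted prediction settings concrete, the following is a minimal sketch of a projected gradient descent loop with stepsize 0.9, at most 60 iterations, sparsity level 3, and a nonnegative feasible set. It is not the authors' Algorithm 2: the least-squares objective, the top-s projection, the Gaussian compression matrix, the reduced dimensions, and all function names are assumptions chosen for illustration, and it uses NumPy rather than the PyTorch setup reported above.

```python
import numpy as np


def project_sparse_nonneg(v, s):
    """Euclidean projection onto {v >= 0, at most s nonzeros}.

    Assumed projection: clip negatives to zero, then keep the s largest entries.
    """
    v = np.maximum(v, 0.0)
    out = np.zeros_like(v)
    if s > 0:
        idx = np.argpartition(v, -s)[-s:]  # indices of the s largest entries
        out[idx] = v[idx]
    return out


def projected_gradient_descent(Phi, target, s, eta=0.9, T=60):
    """Approximately minimize 0.5 * ||Phi v - target||^2 over sparse nonnegative v.

    eta and T default to the stepsize and iteration cap quoted in the table;
    the objective itself is an assumption of this sketch.
    """
    v = np.zeros(Phi.shape[1])
    for _ in range(T):
        grad = Phi.T @ (Phi @ v - target)
        v = project_sparse_nonneg(v - eta * grad, s)
    return v


# Toy problem, far smaller than the K = 2 x 10^4 setting reported in the table.
rng = np.random.default_rng(0)
m, K, s = 100, 400, 3
Phi = rng.normal(size=(m, K)) / np.sqrt(m)  # hypothetical compression matrix
v_true = np.zeros(K)
v_true[rng.choice(K, size=s, replace=False)] = 1.0
v_hat = projected_gradient_descent(Phi, Phi @ v_true, s)
print("nonzeros in estimate:", np.count_nonzero(v_hat))
```

Whatever the data, every iterate has at most `s` nonzero entries and no negative entries, because the projection is applied after each gradient step. Whether the estimate recovers the true support depends on the compression matrix and is not guaranteed by this sketch.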