Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Finding Low-Rank Matrix Weights in DNNs via Riemannian Optimization: RAdaGrad and RAdamW
Authors: Fengmiao Bian, Jinyang ZHENG, Ziyun Liu, Jianzhou Luo, Jian-Feng CAI
NeurIPS 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate the effectiveness of our algorithms through finetuning experiments on large language models and diffusion models. Experimental results consistently demonstrate that our algorithms provide superior performance compared to state-of-the-art methods. |
| Researcher Affiliation | Academia | Department of Mathematics, The Hong Kong University of Science and Technology, Hong Kong, China |
| Pseudocode | Yes | Algorithm 1 Pseudocode of plain RGD in Pytorch |
| Open Source Code | Yes | The code for all our experiments can be found in the supplementary materials. |
| Open Datasets | Yes | We conduct experiments across a range of tasks, including fine-tuning large language models (GPT-2 [33]), fine-tuning diffusion models (Mix-of-Show [14] and Stable Diffusion V1.5 [35]), and deep neural network (DNN) compression tasks. We validate the effectiveness of the proposed algorithms in DNNs compression on the MNIST and CIFAR10 datasets |
| Dataset Splits | Yes | To evaluate the performance of the algorithms, we randomly divide the MNIST dataset [10] into a training dataset with 50,000 samples and a test dataset with 10,000 samples. |
| Hardware Specification | Yes | The large model fine-tuning experiments are conducted on the NVIDIA A100 GPUs, while the DNN compression experiments are performed on a system equipped with an AMD Ryzen 7 7800X3D 8-core CPU and an Nvidia RTX 4090 GPU. |
| Software Dependencies | No | The pseudocode mentions 'Pytorch', but specific version numbers for software dependencies like Python or PyTorch are not provided in the paper text. |
| Experiment Setup | Yes | To ensure fairness, all hyperparameters except for the learning rate and weight decay are kept consistent, and the optimal combinations are determined via grid search (complete implementation details and hyperparameter settings can be found in Appendix C.1). |
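
The Dataset Splits row quotes a random division of MNIST into 50,000 training and 10,000 test samples. A minimal sketch of how such a split could be made repeatable is shown below. It is not the authors' code: the pool size of 60,000, the seed value, and the helper name `split_indices` are all assumptions made for illustration, since the quoted text does not state which pool the samples are drawn from or whether a seed was fixed.

```python
import random


def split_indices(n_total, n_train, n_test, seed=0):
    """Return two disjoint, randomly chosen index lists (train, test).

    Hypothetical helper for illustration; not taken from the paper.
    """
    if n_train + n_test > n_total:
        raise ValueError("requested split is larger than the dataset")
    rng = random.Random(seed)  # fixed seed so the same split can be regenerated
    indices = list(range(n_total))
    rng.shuffle(indices)
    train = indices[:n_train]
    test = indices[n_train:n_train + n_test]
    return train, test


# Assumption: the samples are drawn from a pool of 60,000 images.
train_idx, test_idx = split_indices(60_000, 50_000, 10_000, seed=0)
```

Recording the seed alongside the split sizes is what makes a "random" division reproducible; without it, the quoted sizes alone do not determine which samples land in each subset.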