Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

A geometric framework for momentum-based optimizers for low-rank training

Authors: Steffen Schotthöfer, Timon Klein, Jonas Kusch

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We validate our methods through numerical experiments, demonstrating stronger validation metrics at given parameter budgets. In Section 5 we underline the efficiency of the proposed method through numerical experiments.
Researcher Affiliation	Academia	Computer Science and Mathematics Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA (mail correspondence: EMAIL); Department of Mathematics, Otto von Guericke University Magdeburg, 39106 Magdeburg, Germany; Scientific Computing, Norwegian University of Life Sciences, Drøbakveien 31, 1433 Ås, Norway
Pseudocode	Yes	Algorithm 1: Single iteration of the dynamical low-rank momentum method. Algorithm 2: Single iteration of the low-rank Adam method.
Open Source Code	No	Answer: [No] Justification: We provide the open source code upon paper acceptance
Open Datasets	Yes	UCM Data, Cifar10 Data, Cifar100 Data. We fine-tune the 183M parameter DeBERTa V3-base transformer model [11] on the GLUE benchmark suite [35]. Llama2 7b-chat-hf on BoolQ and PIQA: We compare Algorithm 2 with LoRA on Llama-2-7b-chat-hf [33] across reasoning benchmarks, including BoolQ [4] and PIQA [1]... GPT2 on OpenWebText: We pretrain Karpathy's reproduction of the 124M-parameter GPT-2 model [24] from scratch on the OpenWebText dataset [8]...
Dataset Splits	Yes	We normalize the training and validation data using channel-wise means [0.485, 0.456, 0.406] and standard deviations [0.229, 0.224, 0.225]. Convolutional neural networks (CNNs) are applied directly to the original 256 × 256 image resolution. For the Vision Transformer (ViT), the input images are resized to 224 × 224 pixels within the data pipeline. Table 9: Summary of GLUE benchmark tasks (columns: Corpus, Task, #Train, #Dev, #Test, #Label, Metrics).
Hardware Specification	Yes	All experiments in this paper are computed using workstation GPUs. Each training run used a single GPU, except for GPT-2 pretraining, which was performed on two NVIDIA H100 GPUs. Specifically, we have used 5 NVIDIA RTX A6000, 3 NVIDIA RTX 4090, and 2 NVIDIA H100.
Software Dependencies	No	In this paper, we use the PyTorch implementation for neural network training.
Experiment Setup	Yes	Table 6: Training hyperparameters for the UCM, Cifar10, Cifar100 and ImageNet1k Benchmark. Table 10: Hyper-parameter setup for the GLUE benchmark, determined by an initial hyperparameter sweep.
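
The channel-wise normalization quoted in the Dataset Splits row can be illustrated with a minimal sketch. Only the mean and standard deviation values come from the quoted text; the function names, the nested-list image layout, and the assumption that pixel values are already scaled to [0, 1] are hypothetical and not taken from the paper or its code.

```python
# Channel-wise statistics as quoted in the Dataset Splits row.
MEAN = (0.485, 0.456, 0.406)
STD = (0.229, 0.224, 0.225)


def normalize_pixel(rgb, mean=MEAN, std=STD):
    """Normalize one RGB pixel channel by channel.

    Assumption: each value in `rgb` is already scaled to [0, 1].
    """
    return tuple((v - m) / s for v, m, s in zip(rgb, mean, std))


def normalize_image(image, mean=MEAN, std=STD):
    """Normalize an image given as rows of RGB tuples (hypothetical layout)."""
    return [[normalize_pixel(px, mean, std) for px in row] for row in image]
```

A pixel equal to the channel means maps to zero in every channel, and a pixel one standard deviation above the means maps to approximately one in every channel.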