Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. [K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026].

From Linear to Nonlinear: Provable Weak-to-Strong Generalization through Feature Learning

Authors: Junsoo Oh, Jerry Song, Chulhee Yun

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We conduct experiments to support our findings, using NVIDIA RTX A6000 GPUs. We perform experiments in our setting described in Section 2. We also provide empirical results using a real-world dataset MNIST.
Researcher Affiliation	Academia	Junsoo Oh Jerry Song Chulhee Yun KAIST EMAIL
Pseudocode	No	The paper describes update rules for model parameters (Equations 1 and 2) but does not present them within a structured pseudocode or algorithm block.
Open Source Code	No	We do not provide code in supplemental material. However, our results in synthetic data and MNIST data can be easily reproduced since we opened all details.
Open Datasets	Yes	We also provide empirical results using a real-world dataset MNIST.
Dataset Splits	Yes	We first train the weak model using n_wk = 5000 true-labeled data points. ... We use three different values for the number of data points, n_st = 75, 2000, 20000. ... Then, we train the strong model using labels predicted by the trained weak model, with varying numbers of training samples n_st = 500, 1000, 1500, 2000, 2500.
Hardware Specification	Yes	We conduct experiments to support our findings, using NVIDIA RTX A6000 GPUs.
Software Dependencies	No	The paper mentions using "stochastic gradient descent" and "full-batch Adam optimizer with default parameters" but does not specify any software names with version numbers (e.g., Python, PyTorch, TensorFlow, CUDA versions).
Experiment Setup	Yes	The training is conducted for 1000 epochs using stochastic gradient descent with batch size 256 and learning rate η = 0.1... We use the strong model with m = 50 filters and an initialization scale σ_0 = 0.01. We train the strong model using stochastic gradient descent with batch size 256 and learning rate η = 0.1... We train the strong model for 2000 training epochs when n_st = 75 or n_st = 2000, and for 10000 epochs when n_st = 20000... We train each model for 300 epochs using the full-batch Adam optimizer with default parameters.
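
Since the paper releases no code, the strong-model hyperparameters quoted in the Experiment Setup row can be collected into a small configuration object, as a starting point for a reproduction attempt. The sketch below is illustrative only: the class name StrongModelConfig, its field names, and the STRONG_SAMPLE_SIZES constant are hypothetical, and only the numeric values (m = 50 filters, initialization scale 0.01, batch size 256, learning rate 0.1, and the epoch counts for the strong-model sample sizes 75, 2000, and 20000) come from the quoted text. Details elided from the quotes, such as the model architecture and data generation, are deliberately left out.

```python
from dataclasses import dataclass

# Strong-model sample sizes (n_st) quoted in the Dataset Splits row.
STRONG_SAMPLE_SIZES = (75, 2000, 20000)


@dataclass(frozen=True)
class StrongModelConfig:
    """Hypothetical container for the quoted strong-model hyperparameters."""

    n_st: int                   # number of weakly labeled training points
    num_filters: int = 50       # m = 50 filters
    init_scale: float = 0.01    # initialization scale sigma_0
    batch_size: int = 256       # SGD batch size
    learning_rate: float = 0.1  # SGD learning rate eta

    @property
    def epochs(self) -> int:
        """Epoch count as quoted: 2000 for n_st in {75, 2000}, 10000 for 20000."""
        if self.n_st in (75, 2000):
            return 2000
        if self.n_st == 20000:
            return 10000
        # The quoted text gives no epoch count for other sample sizes.
        raise ValueError(f"no quoted epoch count for n_st={self.n_st}")


# One configuration per quoted sample size.
configs = [StrongModelConfig(n_st=n) for n in STRONG_SAMPLE_SIZES]

for cfg in configs:
    print(cfg.n_st, cfg.epochs)
```

The epoch count is exposed as a derived property rather than a stored field so that it cannot drift out of sync with the sample size; sample sizes with no quoted epoch count raise an error instead of silently falling back to a guess.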