Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Embedding Principle of Homogeneous Neural Network for Classification Problem

Authors: Jiahan Zhang, Yaoyu Zhang, Tao Luo

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental We conduct several experiments to justify that trajectories are preserved. Our findings offer insights into the effects of network width, parameter redundancy, and the structural connections between solutions found via optimization in homogeneous networks of varying sizes. ... Empirical Validation and Implications. A key prediction of our theoretical results is the principle of trajectory preservation (Theorem 5.2). To justify this claim empirically, we conducted experiments verifying this principle using discrete-time gradient descent. We trained pairs of narrow and wide homogeneous MLPs on a 2D linearly separable dataset (Exp. 1), with full implementation details deferred to Appendix D.
Researcher Affiliation Academia Jiahan Zhang¹, Yaoyu Zhang¹,², Tao Luo¹,²,³; ¹School of Mathematical Sciences, Shanghai Jiao Tong University; ²Institute of Natural Sciences, MOE-LSC, Shanghai Jiao Tong University, Shanghai, 200240, China; ³CMA-Shanghai, Shanghai Jiao Tong University, Shanghai, 200240, China. Corresponding authors: EMAIL, EMAIL
Pseudocode No The paper primarily presents theoretical frameworks, definitions, theorems, and proofs. It describes methods and procedures in mathematical notation and textual explanations, but does not include any clearly labeled 'Pseudocode' or 'Algorithm' blocks.
Open Source Code Yes Code available: https://github.com/Silentmoonlight/kkt-embedding-principle
Open Datasets Yes D.1 Experiments on 2D Toy Datasets ... D.2 Experiment on the MNIST Dataset ... To test our principle on a more realistic task, we use the MNIST dataset, focusing on the binary classification of digits 3 versus 5.
Dataset Splits No The paper mentions generating a 2D toy dataset and using the MNIST dataset for binary classification of digits '3' versus '5'. It specifies using full batches for GD or batch sizes for SGD, but does not explicitly provide details about training/validation/test splits (e.g., percentages, counts, or references to predefined splits) for either dataset, which is necessary for reproduction.
Hardware Specification No The paper describes the experimental setup, model architectures, and training procedures but does not explicitly mention any specific hardware used for running the experiments (e.g., specific CPU, GPU models, or cloud computing instances).
Software Dependencies No We use PyTorch for all implementations.
Experiment Setup Yes D.1 Experiments on 2D Toy Datasets ... Training Setup. Networks are trained for 100,000 steps using an exponential loss function. The learning rate is 0.1 for MLP experiments and 0.001 for the CNN experiment. For Gradient Descent (GD, Exp. 1, 2, 5), the full batch is used in each step. For Stochastic Gradient Descent (SGD, Exp. 3, 4), we use a batch size of 16. ... D.2 Experiment on the MNIST Dataset ... Both models are trained for 1000 epochs using SGD with a learning rate of 0.001 and a batch size of 64. Data is normalized. Identical mini-batches are used for the narrow and wide networks at each step.
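
The trajectory-preservation check quoted in the table (full-batch GD, exponential loss, learning rate 0.1, a narrow and a wide homogeneous MLP on 2D linearly separable data) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper or its repository: the four data points, the initial weights, the 100-step horizon, the helper names, and above all the embedding itself, which is taken to be a simple neuron-splitting map (one hidden neuron replaced by two copies scaled by c1 and c2 with c1^2 + c2^2 = 1). The paper reports using PyTorch; this sketch uses only the Python standard library so that it stays self-contained.

```python
import math

def forward(params, x):
    """Bias-free two-layer ReLU net f(x) = sum_k a_k * relu(w_k . x).
    Each neuron is stored as [w0, w1, a]; f is homogeneous of degree 2."""
    return sum(a * max(w0 * x[0] + w1 * x[1], 0.0) for w0, w1, a in params)

def exp_loss(params, data):
    """Mean exponential loss over the full batch, labels in {-1, +1}."""
    return sum(math.exp(-y * forward(params, x)) for x, y in data) / len(data)

def gd_step(params, data, lr):
    """One full-batch gradient descent step with hand-derived gradients."""
    n = len(data)
    grads = [[0.0, 0.0, 0.0] for _ in params]
    for x, y in data:
        # Derivative of the mean loss with respect to f(x) for this sample.
        g = -y * math.exp(-y * forward(params, x)) / n
        for k, (w0, w1, a) in enumerate(params):
            pre = w0 * x[0] + w1 * x[1]
            if pre > 0.0:  # gradient flows only through active ReLUs
                grads[k][0] += g * a * x[0]
                grads[k][1] += g * a * x[1]
                grads[k][2] += g * pre
    return [[p - lr * d for p, d in zip(neuron, grad)]
            for neuron, grad in zip(params, grads)]

def split_neuron(params, k, c1, c2):
    """Hypothetical embedding: replace neuron k by two scaled copies.
    Requires c1, c2 > 0 and c1**2 + c2**2 == 1 to keep the output unchanged."""
    w0, w1, a = params[k]
    copies = [[c * w0, c * w1, c * a] for c in (c1, c2)]
    return params[:k] + copies + params[k + 1:]

# Assumed toy data: four linearly separable 2D points.
DATA = [((1.0, 2.0), 1.0), ((2.0, 1.0), 1.0),
        ((-1.0, -2.0), -1.0), ((-2.0, -1.0), -1.0)]

narrow = [[0.3, 0.2, 0.5], [-0.2, -0.4, -0.5]]  # assumed initial weights
wide = split_neuron(narrow, 0, 0.6, 0.8)        # 0.6**2 + 0.8**2 = 1

initial_loss = exp_loss(narrow, DATA)
for _ in range(100):  # far shorter than the 100,000 steps reported above
    narrow = gd_step(narrow, DATA, lr=0.1)
    wide = gd_step(wide, DATA, lr=0.1)

# Largest disagreement between the two networks on the training points.
max_gap = max(abs(forward(narrow, x) - forward(wide, x)) for x, _ in DATA)
final_loss = exp_loss(narrow, DATA)
print(f"max output gap: {max_gap:.2e}")
print(f"loss: {initial_loss:.4f} -> {final_loss:.4f}")
```

Under this assumed splitting, each copy's gradient is the original neuron's gradient scaled by the same factor, so the wide network's parameters stay equal to the scaled narrow parameters after every discrete GD step, and the two networks compute the same function up to floating-point rounding. For the SGD experiments the same argument would need both networks to see identical mini-batches at each step, as the quoted setup states.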