Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Learn2Mix: Training Neural Networks Using Adaptive Data Integration

Authors: Shyam Venkatasubramanian, Vahid Tarokh

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Empirical evaluations on benchmark datasets show that neural networks trained with learn2mix converge faster than those trained with existing approaches, achieving improved results for classification, regression, and reconstruction tasks under limited training resources and with imbalanced classes. Our empirical findings are supported by theoretical analysis.
Researcher Affiliation	Academia	Shyam Venkatasubramanian Duke University EMAIL Vahid Tarokh Duke University EMAIL
Pseudocode	Yes	Algorithm 1: Neural Network Training Via Learn2Mix Input: J (Original Training Dataset), θ (Initial NN Parameters), α (Initial Mixing Parameters), η (Learning Rate), γ (Mixing Rate), M (Batch Size), P (No. of Batches), E (Epochs) Output: θ (Trained NN Parameters) Algorithm 2: Updating Mixing Parameters Via Learn2Mix Input: α (Previous Mixing Parameters), L(θ) (Class-wise loss vector), γ (Mixing Rate) Output: α (Updated Mixing Parameters)
Open Source Code	Yes	GitHub repository: https://github.com/shyamven/Learn2Mix.
Open Datasets	Yes	We first present the classification results on three benchmark datasets (MNIST [Deng, 2012], Fashion-MNIST [Xiao et al., 2017], CIFAR-10 [Krizhevsky et al., 2009]), and three standard datasets with manually imbalanced classes (Imagenette [Howard, 2020], CIFAR-100 [Krizhevsky et al., 2009], and IMDB [Maas et al., 2011]). ...for the regression task, we study two benchmark datasets with manually imbalanced classes (Wine Quality [Cortez et al., 2009], and California Housing [Géron, 2022]).
Dataset Splits	Yes	For the MNIST classification result from Section 4.1, the original training dataset, J, comprises N = 60000 samples, wherein the fixed-proportion mixing parameters (for default numerical class ordering of digits from 1–10) are: α = [0.0987, 0.1124, 0.0993, 0.1022, 0.0974, 0.0904, 0.0986, 0.1044, 0.0975, 0.0991]^T. The test dataset, K, comprises N_test = 10000 samples, with class proportions equivalent to the class proportions in the base MNIST test dataset.
Hardware Specification	Yes	For the evaluations that follow, all training was performed on an NVIDIA GeForce RTX 3090 GPU.
Software Dependencies	No	The paper mentions optimizers like 'Adam optimizer' and 'RMSProp optimizer', and loss functions like 'Cross Entropy Loss' and 'Mean Squared Error (MSE) Loss', but does not provide specific version numbers for these or for underlying software frameworks or libraries (e.g., PyTorch, TensorFlow, Python version).
Experiment Setup	Yes	The complete list of model architectures and hyperparameters is in Section D of the Appendix. Table 3: Neural network training hyperparameters (grouped by task). Dataset: MNIST; Task: Classification; Optimizer: Adam; Learning Rate (η): 0.0001; Mixing Rate (γ) (Learn2Mix): 0.1; Batch Size (M): 1000
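
To make the MNIST entries in the table more concrete, the sketch below shows how the reported mixing parameters α, batch size M = 1000, and mixing rate γ = 0.1 could fit together. It is illustrative only. The function names per_class_counts and update_mixing are hypothetical, and so is the largest-remainder rounding. The update rule itself (moving α a fraction γ toward the normalized class-wise losses) is an assumption chosen to match the inputs and output listed for Algorithm 2; the paper and the linked repository are the authoritative source.

```python
# Values copied from the MNIST entries in the table above.
ALPHA_MNIST = [0.0987, 0.1124, 0.0993, 0.1022, 0.0974,
               0.0904, 0.0986, 0.1044, 0.0975, 0.0991]
BATCH_SIZE = 1000    # M
MIXING_RATE = 0.1    # gamma


def per_class_counts(alpha, batch_size):
    """Split one batch across classes in proportion to alpha.

    Hypothetical helper: uses largest-remainder rounding so that the
    integer counts sum exactly to batch_size.
    """
    raw = [a * batch_size for a in alpha]
    counts = [int(r) for r in raw]            # floor of each share
    shortfall = batch_size - sum(counts)      # samples still unassigned
    # Give the leftover samples to the classes with the largest remainders.
    by_remainder = sorted(range(len(alpha)),
                          key=lambda i: raw[i] - counts[i],
                          reverse=True)
    for i in by_remainder[:shortfall]:
        counts[i] += 1
    return counts


def update_mixing(alpha, class_losses, gamma):
    """ASSUMED update rule, not confirmed by the table above.

    Moves alpha a fraction gamma toward the class-wise losses after
    normalizing them to sum to 1, so the result remains a valid
    set of proportions.
    """
    total = sum(class_losses)
    target = [loss / total for loss in class_losses]
    return [a + gamma * (t - a) for a, t in zip(alpha, target)]


counts = per_class_counts(ALPHA_MNIST, BATCH_SIZE)

# Hypothetical class-wise losses: the last class is harder than the rest.
example_losses = [1.0] * 9 + [3.0]
new_alpha = update_mixing(ALPHA_MNIST, example_losses, MIXING_RATE)
```

Under this assumed rule, a class with an above-average loss receives a larger share of the next batch, while the proportions continue to sum to 1 because both α and the normalized losses do.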