Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Theory of Curriculum Learning, with Convex Loss Functions
Authors: Daphna Weinshall, Dan Amir
JMLR 2020 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | Although methods based on this concept have been empirically shown to improve performance of several machine learning algorithms, no theoretical analysis has been provided even for simple cases. To address this shortfall, we start by formulating an ideal definition of difficulty score: the loss of the optimal hypothesis at a given datapoint. We analyze the possible contribution of curriculum learning based on this score in two convex problems: linear regression, and binary classification by hinge loss minimization. We show that in both cases, the convergence rate of SGD optimization decreases monotonically with the difficulty score, in accordance with earlier empirical results. We also prove that when the difficulty score is fixed, the convergence rate of SGD optimization is monotonically increasing with respect to the loss of the current hypothesis at each point. |
| Researcher Affiliation | Academia | Daphna Weinshall (EMAIL), Dan Amir (EMAIL), School of Computer Science and Engineering, Hebrew University of Jerusalem, Jerusalem 91904, Israel |
| Pseudocode | No | The paper consists of definitions, theorems, lemmas, proofs, and mathematical derivations without structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain any explicit statements or links indicating the availability of open-source code for the described methodology. |
| Open Datasets | No | The paper focuses on theoretical analysis using abstract concepts of 'training examples' and 'distribution D' without referring to any specific named public datasets or providing access information for any data used. |
| Dataset Splits | No | The paper is theoretical in nature, presenting mathematical analyses and proofs, and does not involve experimental setup requiring dataset splits. |
| Hardware Specification | No | The paper is a theoretical work and does not describe any experimental procedures or hardware specifications used for running experiments. |
| Software Dependencies | No | The paper presents theoretical analysis and does not mention any specific software dependencies or version numbers required to replicate experimental results. |
| Experiment Setup | No | The paper is a theoretical study providing mathematical proofs and analyses, and therefore does not include specific experimental setup details, hyperparameters, or training configurations. |
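The paper's central notion, as quoted above, is an ideal difficulty score defined as the loss of the optimal hypothesis at each datapoint, with easier (lower-score) examples predicted to yield faster SGD convergence. The sketch below illustrates that idea on synthetic linear regression. It is a minimal illustration, not the paper's own code (the paper is theoretical and releases none): the synthetic data, the `sgd` helper, and the use of the true generating weights as a stand-in for the optimal hypothesis are all assumptions for the example, since in practice the optimal hypothesis is unknown and the score must be estimated.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic linear regression data: y = w_opt . x + noise.
# w_opt stands in for the (normally unknown) optimal hypothesis.
n, d = 200, 5
w_opt = rng.normal(size=d)
X = rng.normal(size=(n, d))
y = X @ w_opt + 0.1 * rng.normal(size=n)

# Ideal difficulty score from the paper: loss of the optimal
# hypothesis at each datapoint (here, squared residual under w_opt).
difficulty = (X @ w_opt - y) ** 2
curriculum = np.argsort(difficulty)  # easiest examples first

def sgd(indices, lr=0.01, epochs=5):
    """Plain SGD on squared loss, visiting examples in the given order."""
    w = np.zeros(d)
    for _ in range(epochs):
        for i in indices:
            grad = 2.0 * (X[i] @ w - y[i]) * X[i]
            w -= lr * grad
    return w

w_curr = sgd(curriculum)              # curriculum ordering
w_rand = sgd(rng.permutation(n))      # random-order baseline

loss_curr = float(np.mean((X @ w_curr - y) ** 2))
loss_rand = float(np.mean((X @ w_rand - y) ** 2))
print("curriculum loss:", loss_curr)
print("random loss:   ", loss_rand)
```

Both runs converge on this easy convex problem; the paper's result concerns the per-step convergence rate as a function of the difficulty score, not a guarantee that curriculum ordering always ends at a lower final loss.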