Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Theory of Curriculum Learning, with Convex Loss Functions
Authors: Daphna Weinshall, Dan Amir
JMLR 2020 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | Although methods based on this concept have been empirically shown to improve performance of several machine learning algorithms, no theoretical analysis has been provided even for simple cases. To address this shortfall, we start by formulating an ideal definition of difficulty score: the loss of the optimal hypothesis at a given datapoint. We analyze the possible contribution of curriculum learning based on this score in two convex problems: linear regression, and binary classification by hinge loss minimization. We show that in both cases, the convergence rate of SGD optimization decreases monotonically with the difficulty score, in accordance with earlier empirical results. We also prove that when the difficulty score is fixed, the convergence rate of SGD optimization is monotonically increasing with respect to the loss of the current hypothesis at each point. |
| Researcher Affiliation | Academia | Daphna Weinshall (EMAIL), Dan Amir (EMAIL), School of Computer Science and Engineering, Hebrew University of Jerusalem, Jerusalem 91904, Israel |
| Pseudocode | No | The paper consists of definitions, theorems, lemmas, proofs, and mathematical derivations without structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain any explicit statements or links indicating the availability of open-source code for the described methodology. |
| Open Datasets | No | The paper focuses on theoretical analysis using abstract concepts of 'training examples' and 'distribution D' without referring to any specific named public datasets or providing access information for any data used. |
| Dataset Splits | No | The paper is theoretical in nature, presenting mathematical analyses and proofs, and does not involve experimental setup requiring dataset splits. |
| Hardware Specification | No | The paper is a theoretical work and does not describe any experimental procedures or hardware specifications used for running experiments. |
| Software Dependencies | No | The paper presents theoretical analysis and does not mention any specific software dependencies or version numbers required to replicate experimental results. |
| Experiment Setup | No | The paper is a theoretical study providing mathematical proofs and analyses, and therefore does not include specific experimental setup details, hyperparameters, or training configurations. |
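The paper's central notion, as quoted above, is an ideal difficulty score defined as the loss of the optimal hypothesis at each datapoint, with easier (lower-score) examples predicted to yield faster SGD convergence. The sketch below illustrates that idea on synthetic linear regression. It is a minimal illustration, not the paper's own code (the paper is theoretical and releases none): the synthetic data, the `sgd` helper, and the use of the true generating weights as a stand-in for the optimal hypothesis are all assumptions for the example, since in practice the optimal hypothesis is unknown and the score must be estimated.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic linear regression data: y = w_opt . x + noise.
# w_opt stands in for the (normally unknown) optimal hypothesis.
n, d = 200, 5
w_opt = rng.normal(size=d)
X = rng.normal(size=(n, d))
y = X @ w_opt + 0.1 * rng.normal(size=n)

# Ideal difficulty score from the paper: loss of the optimal
# hypothesis at each datapoint (here, squared residual under w_opt).
difficulty = (X @ w_opt - y) ** 2
curriculum = np.argsort(difficulty)  # easiest examples first

def sgd(indices, lr=0.01, epochs=5):
    """Plain SGD on squared loss, visiting examples in the given order."""
    w = np.zeros(d)
    for _ in range(epochs):
        for i in indices:
            grad = 2.0 * (X[i] @ w - y[i]) * X[i]
            w -= lr * grad
    return w

w_curr = sgd(curriculum)              # curriculum ordering
w_rand = sgd(rng.permutation(n))      # random-order baseline

loss_curr = float(np.mean((X @ w_curr - y) ** 2))
loss_rand = float(np.mean((X @ w_rand - y) ** 2))
print("curriculum loss:", loss_curr)
print("random loss:   ", loss_rand)
```

Both runs converge on this easy convex problem; the paper's result concerns the per-step convergence rate as a function of the difficulty score, not a guarantee that curriculum ordering always ends at a lower final loss.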