reproducibilityindex.ai

On the Statistical Benefits of Curriculum Learning

Authors: Ziping Xu, Ambuj Tewari

ICML 2022

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	In this paper, we study the benefits of CL in the multitask linear regression problem under both structured and unstructured settings. For both settings, we derive the minimax rates for CL with the oracle that provides the optimal curriculum and without the oracle, where the agent has to adaptively learn a good curriculum. Our results reveal that adaptive learning can be fundamentally harder than the oracle learning in the unstructured setting, but it merely introduces a small extra term in the structured setting. To connect theory with practice, we provide justification for a popular empirical method that selects tasks with highest local prediction gain by comparing its guarantees with the minimax rates mentioned above. To complement the theoretical analyses, we conduct simulation studies by applying actual SGD with tasks chosen to maximize the local prediction gain.
Researcher Affiliation	Academia	Department of Statistics, University of Michigan, Ann Arbor.
Pseudocode	Yes	Algorithm 1 CL by optimistic scheduling
Open Source Code	No	The paper does not provide any statement or link indicating that its source code is publicly available.
Open Datasets	No	The paper describes how synthetic data was generated for simulations: "The true parameters of all the tasks are sampled from N(0, 0.001 I_d). On expectation, the transfer distance 2 t,T between task t and the target task is about 0.01d. The input x's are sampled from the same distribution N(0, I_d) for all the tasks." This is not a publicly accessible dataset with a specific name or link.
Dataset Splits	No	The paper mentions "total number of observations n" for simulations but does not specify explicit train/validation/test dataset splits (e.g., percentages or counts) or refer to standard predefined splits.
Hardware Specification	No	The paper describes simulation studies but does not provide any specific details about the hardware (e.g., CPU, GPU models, memory) used to run these simulations.
Software Dependencies	No	The paper does not mention any specific software or library names with version numbers that would be required to reproduce the experiments.
Experiment Setup	Yes	We set T = 5 and σ_t^2 = 0.001, 0.01, 0.1, 1, 1 for t = 1, ..., 5, respectively. Note that the 5th task is the target task. We test the effects of the total number of observations n = 10, 50, 100, 500, 1000 and the effects of the dimension d = 5, 10, 50, 100. By default, we set n = 1000 and d = 5. In our experiment, η = 0.85.
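
Since no code or dataset is released, the synthetic setup quoted in the Open Datasets and Experiment Setup rows can be approximated with a short script. The following is a minimal sketch, not the authors' implementation. It assumes a linear model y = x·θ_t + noise and that σ_t^2 is the variance of additive Gaussian noise for task t. The function names, the random seed, and the choice to draw all n observations from the target task are illustrative; the table does not state how observations are allocated across tasks, and η is not used here because its role is not described.

```python
import numpy as np

# Values quoted in the Experiment Setup row.
T = 5
NOISE_VAR = [0.001, 0.01, 0.1, 1.0, 1.0]  # sigma_t^2 for t = 1..5; task 5 is the target
N_DEFAULT = 1000
D_DEFAULT = 5


def sample_task_params(d, rng, num_tasks=T, param_var=0.001):
    """Draw one true parameter vector per task from N(0, param_var * I_d)."""
    return rng.normal(0.0, np.sqrt(param_var), size=(num_tasks, d))


def sample_observations(theta, noise_var, n, rng):
    """Draw n pairs (x, y) with x ~ N(0, I_d) and y = x @ theta + noise.

    Assumption: sigma_t^2 is the variance of additive Gaussian noise.
    """
    d = theta.shape[0]
    X = rng.normal(size=(n, d))
    y = X @ theta + rng.normal(0.0, np.sqrt(noise_var), size=n)
    return X, y


rng = np.random.default_rng(0)  # arbitrary seed, not taken from the paper
thetas = sample_task_params(D_DEFAULT, rng)
# Hypothetical allocation: all n observations drawn from the target task.
X_target, y_target = sample_observations(thetas[-1], NOISE_VAR[-1], N_DEFAULT, rng)
```

Sweeping n over 10, 50, 100, 500, 1000 or d over 5, 10, 50, 100 amounts to calling the two functions with different arguments while holding the other value at its default.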