Few-Shot Learning via Learning the Representation, Provably

Authors: Simon Shaolei Du, Wei Hu, Sham M. Kakade, Jason D. Lee, Qi Lei

ICLR 2021

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | This paper studies few-shot learning via representation learning... First, we study the setting where this common representation is low-dimensional and provide a risk bound of O(dk/(n1 T) + k/n2) on the target task for the linear representation class... We further extend this result to handle a general representation function class and obtain a similar result. Next, we consider the setting where the common representation may be high-dimensional but is capacity-constrained... and show that representation learning can fully utilize all n1T samples from source tasks. While representation learning has achieved tremendous success in a variety of applications (Bengio et al., 2013), its theoretical studies are limited. In existing theoretical work, the most natural algorithm is to explicitly look for the optimal representation given the source data... Our result on high-dimensional representations shows that the capacity control for representation learning does not have to be through explicit low dimensionality. (A worked comparison of the quoted risk bound appears after this table.)
Researcher Affiliation | Collaboration | Simon S. Du (University of Washington), Wei Hu (Princeton University), Sham M. Kakade (University of Washington; Microsoft Research), Jason D. Lee (Princeton University), Qi Lei (Princeton University)
Pseudocode | No | The paper does not contain any explicit pseudocode blocks or algorithm listings.
Open Source Code | No | The paper does not provide any statement about releasing open-source code or a link to a code repository.
Open Datasets | No | The paper uses abstract terms such as "T source tasks with n1 data per task" and "n2 data" for the target task, and refers to "distributions µt over the joint data space". It does not name any public datasets or provide links or citations for data access.
Dataset Splits | No | The paper is theoretical and does not describe experimental data splits for training, validation, or testing. It refers to "n1 i.i.d. samples" from source tasks and "n2 i.i.d. samples" from the target task for theoretical analysis.
Hardware Specification | No | The paper is theoretical and does not describe any specific hardware used for experiments.
Software Dependencies | No | The paper is theoretical and does not describe specific software dependencies or versions used for experiments.
Experiment Setup | No | The paper is theoretical and does not provide details about experimental setup, such as hyperparameters or system-level training settings.
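To make the risk bound quoted in the Research Type row concrete, here is a minimal worked comparison in LaTeX. It is a sketch under stated assumptions: the target-only baseline rate O(d/n2) is the standard linear-regression rate for learning the target task in isolation (not quoted in this report), and the numeric values d = 1000, k = 10, n1T = 10^6, n2 = 100 are illustrative choices, not figures from the paper.

```latex
% Minimal sketch comparing the quoted few-shot risk bound with a
% target-only baseline. The baseline rate O(d/n_2) and all numeric
% values below are illustrative assumptions, not quotes from the report.
\documentclass{article}
\usepackage{amsmath}
\begin{document}

Excess risk on the target task with a learned $k$-dimensional linear
representation ($k \ll d$), pooling all $n_1 T$ source samples:
\[
  \widetilde{O}\!\left( \frac{dk}{n_1 T} + \frac{k}{n_2} \right),
\]
versus learning the target task from scratch, which scales as
\[
  O\!\left( \frac{d}{n_2} \right).
\]
For example, with $d = 1000$, $k = 10$, $n_1 T = 10^6$, and
$n_2 = 100$, the representation-learning bound is on the order of
$10^{-2} + 10^{-1} = 0.11$, while the from-scratch rate is on the
order of $10$: roughly a $100\times$ improvement.

\end{document}
```

The comparison illustrates the point quoted above: pooling all n1T source samples makes the d-dependent term negligible, so the target task effectively pays only for fitting the k-dimensional representation.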