Few-Shot Learning via Learning the Representation, Provably
Authors: Simon Shaolei Du, Wei Hu, Sham M. Kakade, Jason D. Lee, Qi Lei
ICLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | This paper studies few-shot learning via representation learning... First, we study the setting where this common representation is low-dimensional and provide a risk bound of Õ(dk/(n1 T) + k/n2) on the target task for the linear representation class... We further extend this result to handle a general representation function class and obtain a similar result. Next, we consider the setting where the common representation may be high-dimensional but is capacity-constrained... and show that representation learning can fully utilize all n1 T samples from source tasks. While representation learning has achieved tremendous success in a variety of applications (Bengio et al., 2013), its theoretical studies are limited. In existing theoretical work, the most natural algorithm is to explicitly look for the optimal representation given source data... Our result on high-dimensional representations shows that the capacity control for representation learning does not have to be through explicit low dimensionality. |
| Researcher Affiliation | Collaboration | Simon S. Du (University of Washington), Wei Hu (Princeton University), Sham M. Kakade (University of Washington; Microsoft Research), Jason D. Lee (Princeton University), Qi Lei (Princeton University) |
| Pseudocode | No | The paper does not contain any explicit pseudocode blocks or algorithm listings. |
| Open Source Code | No | The paper does not provide any statement about releasing open-source code or links to a code repository. |
| Open Datasets | No | The paper uses abstract terms like "T source tasks with n1 data per task" and "n2 data" for the target task, and refers to "distributions µt over the joint data space". It does not specify any named public datasets or provide links/citations for data access. |
| Dataset Splits | No | The paper is theoretical and does not describe experimental data splits for training, validation, or testing. It refers to "n1 i.i.d. samples" from source tasks and "n2 i.i.d. samples" from the target task for theoretical analysis. |
| Hardware Specification | No | The paper is theoretical and does not describe any specific hardware used for experiments. |
| Software Dependencies | No | The paper is theoretical and does not describe specific software dependencies or versions used for experiments. |
| Experiment Setup | No | The paper is theoretical and does not provide details about experimental setup, such as hyperparameters or system-level training settings. |
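The "most natural algorithm" the paper analyzes (explicitly learning a shared representation from the T source tasks, then fitting only a small head on the n2 target samples) can be illustrated with a simplified linear sketch. All names, dimensions, and noise levels below are hypothetical, and for tractability the sketch recovers the shared subspace via an SVD of per-task least-squares estimates rather than the paper's joint empirical risk minimization; it also assumes n1 > d so each per-task OLS is well-posed.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k, T, n1, n2 = 50, 3, 40, 100, 10  # ambient dim, rep dim, tasks, samples

B_star, _ = np.linalg.qr(rng.standard_normal((d, k)))  # shared representation, d x k
W = rng.standard_normal((k, T))                        # per-task heads w_t

# Source phase: estimate each task's d-dim regressor by OLS, then take the
# top-k left singular vectors of the stacked estimates as the representation.
thetas = []
for t in range(T):
    X = rng.standard_normal((n1, d))
    y = X @ B_star @ W[:, t] + 0.1 * rng.standard_normal(n1)
    thetas.append(np.linalg.lstsq(X, y, rcond=None)[0])
U, _, _ = np.linalg.svd(np.column_stack(thetas), full_matrices=False)
B_hat = U[:, :k]

# Target phase: with the representation frozen, only a k-dim head is fit,
# so n2 = 10 samples suffice even though the ambient dimension is d = 50.
w_tgt = rng.standard_normal(k)
X2 = rng.standard_normal((n2, d))
y2 = X2 @ B_star @ w_tgt + 0.1 * rng.standard_normal(n2)
w_hat = np.linalg.lstsq(X2 @ B_hat, y2, rcond=None)[0]

# Subspace alignment: the smallest singular value of B*^T B_hat is near 1
# exactly when span(B_hat) matches span(B*).
align = np.linalg.svd(B_star.T @ B_hat, compute_uv=False)[-1]
print(f"subspace alignment: {align:.3f}")
```

The target-phase design mirrors the few-shot intuition behind the Õ(dk/(n1 T) + k/n2) bound: the dk parameters of the representation are paid for by all n1 T source samples, while the target task only pays for its k-dimensional head.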