On the Power of Foundation Models
Authors: Yang Yuan
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | Unlike most machine learning theory papers, our paper does not have any assumptions on the data distribution or network structure. Instead, we take the bird's-eye view that is model oblivious, and only focuses on the structure defined by the pretext task. It is indeed possible that by designing a special network, one may get a more powerful model with better performance. However, we stick with our setting because: Empirically, people do not customize network structures for different tasks. Instead, they tend to use similar structures like ResNet (He et al., 2016) or Transformer (Vaswani et al., 2017). |
| Researcher Affiliation | Collaboration | (1) IIIS, Tsinghua University; (2) Shanghai Artificial Intelligence Laboratory; (3) Shanghai Qi Zhi Institute. Correspondence to: Yang Yuan <yuanyang@tsinghua.edu.cn>. |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper is theoretical and does not describe a new methodology for which source code would typically be provided or released. There are no statements or links regarding open-source code for the described theoretical framework. |
| Open Datasets | No | The paper is theoretical and does not perform experiments that involve training on specific public datasets. It refers to training in other research as context but does not conduct its own empirical training. |
| Dataset Splits | No | The paper is theoretical and does not perform experiments, thus it does not provide specific dataset split information for training, validation, or testing. |
| Hardware Specification | No | The paper is theoretical and does not describe any empirical experiments, thus it does not specify any hardware used for running experiments. |
| Software Dependencies | No | The paper is theoretical and does not conduct experiments that would require specific software dependencies with version numbers. |
| Experiment Setup | No | The paper is theoretical and does not describe any empirical experiments, so it includes no experimental setup details such as hyperparameters or training configurations. |