A Theoretical Analysis of Fine-tuning with Linear Teachers
Authors: Gal Shachaf, Alon Brutzkus, Amir Globerson
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our results are corroborated by empirical evaluations. In Figure 1 we empirically verify the conclusions from the bound in (3)... We next evaluate the bound on fine-tuning tasks taken from the MNIST dataset [34]... |
| Researcher Affiliation | Collaboration | Gal Shachaf, Blavatnik School of Computer Science, Tel Aviv University, Israel; Alon Brutzkus, Blavatnik School of Computer Science, Tel Aviv University, Israel; Amir Globerson, Blavatnik School of Computer Science, Tel Aviv University, Israel, and Google Research |
| Pseudocode | No | The paper does not contain explicitly labeled 'Pseudocode' or 'Algorithm' blocks. |
| Open Source Code | No | The paper does not include an explicit statement about releasing source code for the described methodology or a link to a code repository. |
| Open Datasets | Yes | We next evaluate the bound on fine-tuning tasks taken from the MNIST dataset [34] |
| Dataset Splits | No | The paper mentions 'target training points' and training on target tasks but does not specify explicit training/validation/test dataset splits (e.g., percentages, sample counts, or predefined splits). |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments. |
| Software Dependencies | No | The paper mentions general software libraries such as PyTorch, NumPy, SciPy, and Matplotlib in its references, but does not specify their version numbers or other ancillary software dependencies required for replication. |
| Experiment Setup | No | The paper discusses optimization methods like gradient descent and gradient flow but does not provide specific experimental setup details such as concrete hyperparameter values (e.g., batch size, learning rates used in experiments), optimizer settings, or explicit training schedules; an illustrative sketch of such a setup is given below the table. |
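Since the paper releases no code and reports no concrete hyperparameters, the following is a minimal sketch of the kind of linear-teacher fine-tuning experiment it analyzes: a linear model pretrained on a source teacher is fine-tuned with gradient descent on a few target samples. The dimension, sample count, learning rate, and step count below are illustrative assumptions, not values taken from the paper.

```python
# Minimal sketch (not the authors' code) of fine-tuning with a linear teacher.
import numpy as np

rng = np.random.default_rng(0)
d, n_target = 100, 20           # input dimension and number of target training points (assumed)
lr, steps = 0.05, 2000          # learning rate and iteration count (assumed)

w_src = rng.normal(size=d)                  # source ("pretrained") linear teacher
w_tgt = w_src + 0.1 * rng.normal(size=d)    # target linear teacher, close to the source

X = rng.normal(size=(n_target, d))          # target training inputs
y = X @ w_tgt                               # noiseless linear-teacher labels

w = w_src.copy()                            # fine-tuning starts from the source solution
for _ in range(steps):
    grad = X.T @ (X @ w - y) / n_target     # gradient of (1/2n) * ||Xw - y||^2
    w -= lr * grad

# For isotropic inputs x ~ N(0, I), the population target risk is ||w - w_tgt||^2.
print("target risk:", np.linalg.norm(w - w_tgt) ** 2)
print("distance moved from the source solution:", np.linalg.norm(w - w_src))
```

In this overparameterized regime (d > n_target), gradient descent initialized at w_src converges to the interpolating solution closest to the initialization, which is the kind of fine-tuned predictor whose target risk the paper's bound analyzes; the printed distance indicates how far fine-tuning moves the model from the source solution.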