A Theoretical Analysis of Fine-tuning with Linear Teachers

Authors: Gal Shachaf, Alon Brutzkus, Amir Globerson

NeurIPS 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Our results are corroborated by empirical evaluations. In Figure 1 we empirically verify the conclusions from the bound in (3)... We next evaluate the bound on fine-tuning tasks taken from the MNIST dataset [34]..."
Researcher Affiliation | Collaboration | Gal Shachaf: Blavatnik School of Computer Science, Tel Aviv University, Israel; Alon Brutzkus: Blavatnik School of Computer Science, Tel Aviv University, Israel; Amir Globerson: Blavatnik School of Computer Science, Tel Aviv University, Israel, and Google Research
Pseudocode | No | The paper does not contain explicitly labeled 'Pseudocode' or 'Algorithm' blocks.
Open Source Code | No | The paper does not include an explicit statement about releasing source code for the described methodology or a link to a code repository.
Open Datasets | Yes | "We next evaluate the bound on fine-tuning tasks taken from the MNIST dataset [34]"
Dataset Splits | No | The paper mentions 'target training points' and training on target tasks, but does not specify explicit training/validation/test splits (e.g., percentages, sample counts, or predefined splits).
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used to run its experiments.
Software Dependencies | No | The paper cites general software libraries such as PyTorch, NumPy, SciPy, and Matplotlib in its references, but does not specify their version numbers or other ancillary software dependencies required for replication.
Experiment Setup | No | The paper discusses optimization methods such as gradient descent and gradient flow, but does not provide concrete experimental setup details such as hyperparameter values (e.g., batch size, learning rate), optimizer settings, or explicit training schedules.
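For context on the setting these answers refer to: the paper studies fine-tuning a linear model, pretrained on a source task, with gradient descent on a small number of target samples generated by a linear teacher. The sketch below is a hypothetical illustration of that setup on synthetic data, not the authors' code or experimental configuration; the dimensions, sample count, learning rate, and iteration budget are assumptions chosen only to make the example run.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_target = 50, 10               # assumed: input dimension and number of target training points
w_source = rng.normal(size=d)      # stand-in for weights pretrained on a source task
w_target = w_source + 0.1 * rng.normal(size=d)   # target linear teacher, close to the source one

# Small target training set generated by the target linear teacher (underdetermined: n_target < d).
X = rng.normal(size=(n_target, d))
y = X @ w_target

# Fine-tuning: plain gradient descent on the squared loss, initialized at the pretrained weights.
w = w_source.copy()
lr = 0.01                          # assumed learning rate
for _ in range(20_000):            # assumed iteration budget
    w -= lr * X.T @ (X @ w - y) / n_target

# Gradient descent converges to the interpolating solution closest (in L2) to its initialization,
# so the component of w_source orthogonal to the data span is preserved after fine-tuning.
w_closest = w_source + X.T @ np.linalg.solve(X @ X.T, y - X @ w_source)

print("target train loss:", np.mean((X @ w - y) ** 2))
print("distance to closest interpolating solution:", np.linalg.norm(w - w_closest))
print("target population risk (isotropic inputs):", np.linalg.norm(w - w_target) ** 2)
```

The printout reflects the implicit bias of gradient descent on underdetermined linear problems: it converges to the interpolating solution nearest the initialization, which is why the pretrained weights influence the resulting target risk.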