Predicting Inductive Biases of Pre-Trained Models

Authors: Charles Lovering, Rohan Jha, Tal Linzen, Ellie Pavlick

ICLR 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "In experiments with both synthetic and naturalistic data, we find strong evidence (statistically significant correlations) supporting this hypothesis."
Researcher Affiliation | Academia | Brown University, Department of Computer Science ({charles_lovering@, rohan_jha@alumni., ellie_pavlick@}brown.edu); New York University, Department of Linguistics and Center for Data Science (linzen@nyu.edu)
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | Yes | "Our code is available at: https://github.com/cjlovering/predicting-inductive-biases."
Open Datasets | No | "Complete details about implementation of these templates (and all data) will be released upon acceptance."
Dataset Splits | Yes | "Test and validation sets consist of 1,000 examples each from S_both, S_neither, S_t-only, S_s-only." and "Test and validation sets consist of 1000 examples each from S_both, S_neither, S_s-only."
Hardware Specification | No | The paper acknowledges "computational resources and services at the Center for Computation and Visualization, Brown University" but does not specify exact hardware components such as GPU or CPU models.
Software Dependencies | No | The paper names software such as Hugging Face, PyTorch Lightning, specific models (t5-base, bert-base-uncased), and optimizers (AdamW, Adam), but it does not give version numbers for these dependencies (e.g., the PyTorch or Hugging Face Transformers version); a version-check sketch follows the table.
Experiment Setup | Yes | "We fix all hyperparameters, which are reported in Table 6." Table 6 lists: random seeds 1, 2, 3; batch size 128; cumulative MDL block sizes (%); s-only rates (%); learning rate 2e-5; hidden size 300. A configuration sketch follows the table.
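
Since no version numbers are reported for the software stack, a reproduction has to record them independently. The short script below is a minimal sketch, assuming only that the packages named in the paper (PyTorch, Hugging Face Transformers, PyTorch Lightning) are installed; it is not from the paper or its repository and simply prints whatever versions happen to be present.

    # Version-check sketch (illustrative, not from the authors' code).
    # Records the installed versions of the stack named in the paper.
    import torch
    import transformers
    import pytorch_lightning as pl

    print("torch:", torch.__version__)
    print("transformers:", transformers.__version__)
    print("pytorch-lightning:", pl.__version__)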
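
The fixed hyperparameters in Table 6 map directly onto a training configuration. The sketch below is an illustration under stated assumptions, not the authors' implementation: it treats the task as binary classification, uses bert-base-uncased as a representative of the pre-trained models named in the paper, assumes the hidden size of 300 refers to a classifier/probe width, and invents the helper name build_run for the example.

    # Configuration sketch (an assumption, not the authors' code) wiring the
    # Table 6 hyperparameters into a Hugging Face + PyTorch setup.
    import pytorch_lightning as pl
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    CONFIG = {
        "seeds": [1, 2, 3],                 # random seeds reported in Table 6
        "batch_size": 128,                  # batch size reported in Table 6
        "lr": 2e-5,                         # learning rate reported in Table 6
        "hidden_size": 300,                 # reported in Table 6; assumed here to be a classifier/probe width
        "model_name": "bert-base-uncased",  # one of the pre-trained models named in the paper
    }

    def build_run(seed: int):
        """Hypothetical helper: seed one run and return tokenizer, model, and optimizer."""
        pl.seed_everything(seed)
        tokenizer = AutoTokenizer.from_pretrained(CONFIG["model_name"])
        # Binary presence/absence of the target feature is assumed.
        model = AutoModelForSequenceClassification.from_pretrained(
            CONFIG["model_name"], num_labels=2
        )
        optimizer = torch.optim.AdamW(model.parameters(), lr=CONFIG["lr"])
        return tokenizer, model, optimizer

    if __name__ == "__main__":
        for seed in CONFIG["seeds"]:
            tokenizer, model, optimizer = build_run(seed)

Looping over the three seeds mirrors the three-seed setup in Table 6; the cumulative MDL block sizes and s-only rates listed there are experiment-specific schedules and are left out of this sketch.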