Predicting Inductive Biases of Pre-Trained Models
Authors: Charles Lovering, Rohan Jha, Tal Linzen, Ellie Pavlick
ICLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In experiments with both synthetic and naturalistic data, we find strong evidence (statistically significant correlations) supporting this hypothesis. |
| Researcher Affiliation | Academia | Brown University, Department of Computer Science ({charles_lovering@, rohan_jha@alumni., ellie_pavlick@}brown.edu); New York University, Department of Linguistics and Center for Data Science (linzen@nyu.edu) |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code is available at: https://github.com/cjlovering/predicting-inductive-biases. |
| Open Datasets | No | Complete details about implementation of these templates (and all data) will be released upon acceptance. |
| Dataset Splits | Yes | Test and validation sets consist of 1,000 examples each from S_both, S_neither, S_t-only, S_s-only; a second quoted passage reports 1,000 examples each from S_both, S_neither, S_s-only. (A hypothetical sampling sketch follows the table.) |
| Hardware Specification | No | The paper mentions 'computational resources and services at the Center for Computation and Visualization, Brown University' but does not specify any exact hardware components like GPU or CPU models. |
| Software Dependencies | No | The paper mentions software such as Hugging Face and PyTorch Lightning, specific models (t5-base, bert-base-uncased), and optimizers (AdamW, Adam), but it does not provide version numbers for these dependencies (e.g., PyTorch or Hugging Face Transformers versions). |
| Experiment Setup | Yes | We fix all hyperparameters, which are reported in Table 6. Table 6 includes: random seeds 1, 2, 3; batch size 128; cumulative MDL block sizes (%); s-only rates (%); learning rate 2e-5; hidden size 300. (A hypothetical configuration sketch follows the table.) |
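
The Dataset Splits row quotes fixed-size evaluation subsets drawn from named partitions. Below is a minimal sketch, assuming the data has already been partitioned into pools keyed by the subset names quoted above (S_both, S_neither, S_t-only, S_s-only); the pool structure, function name, and sampling logic are illustrative assumptions, not the authors' released code.

```python
# Minimal sketch (assumptions noted above): draw 1,000-example test and validation
# splits from each named subset. Only the subset names and the 1,000-example size
# come from the paper; everything else here is hypothetical.
import random

SUBSETS = ["S_both", "S_neither", "S_t-only", "S_s-only"]
SPLIT_SIZE = 1_000  # 1,000 examples per subset for each of test and validation

def make_eval_splits(pools: dict[str, list], seed: int = 1):
    """pools maps each subset name to a list of examples (hypothetical structure)."""
    rng = random.Random(seed)
    test, validation = {}, {}
    for name in SUBSETS:
        examples = list(pools[name])
        rng.shuffle(examples)
        # Disjoint samples: first 1,000 examples for test, next 1,000 for validation.
        test[name] = examples[:SPLIT_SIZE]
        validation[name] = examples[SPLIT_SIZE : 2 * SPLIT_SIZE]
    return test, validation
```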
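
The Experiment Setup and Software Dependencies rows name the pre-trained models, optimizers, and fixed hyperparameters, but no complete configuration. The sketch below shows one way the reported values (seeds 1, 2, 3; batch size 128; learning rate 2e-5) could be wired together with the Hugging Face Transformers API; the choice of bert-base-uncased over t5-base, the num_labels value, and the loop scaffolding are assumptions for illustration, not the paper's implementation.

```python
# Hypothetical illustration of the fixed hyperparameters reported in Table 6 of the
# paper; everything beyond the quoted values (seeds, batch size, learning rate,
# hidden size, model names, AdamW) is an assumption.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

HPARAMS = {
    "seeds": [1, 2, 3],                 # random seeds reported in Table 6
    "batch_size": 128,                  # batch size reported in Table 6
    "lr": 2e-5,                         # learning rate reported in Table 6
    "hidden_size": 300,                 # hidden size reported in Table 6
    "model_name": "bert-base-uncased",  # one of the pre-trained models the paper mentions
}

def build_model_and_optimizer(hparams: dict):
    """Load a pre-trained sequence classifier and an AdamW optimizer at the reported learning rate."""
    tokenizer = AutoTokenizer.from_pretrained(hparams["model_name"])
    model = AutoModelForSequenceClassification.from_pretrained(
        hparams["model_name"], num_labels=2  # binary task assumed for illustration
    )
    optimizer = torch.optim.AdamW(model.parameters(), lr=hparams["lr"])
    return tokenizer, model, optimizer

for seed in HPARAMS["seeds"]:
    torch.manual_seed(seed)  # one fine-tuning run per reported seed
    tokenizer, model, optimizer = build_model_and_optimizer(HPARAMS)
    # Fine-tuning with batches of HPARAMS["batch_size"] examples would follow here.
```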