*-CFQ: Analyzing the Scalability of Machine Learning on a Compositional Task
Authors: Dmitry Tsarkov, Tibor Tihon, Nathan Scales, Nikola Momchev, Danila Sinopalnikov, Nathanael Schärli
AAAI 2021, pp. 9949-9957 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Using this suite, we conduct a series of experiments investigating the ability of Transformers to benefit from increased training size under conditions of fixed computational cost. We show that compositional generalization remains a challenge at all training sizes, and we show that increasing the scope of natural language leads to consistently higher error rates, which are only partially offset by increased training data. |
| Researcher Affiliation | Industry | Google Research, Brain Team {tsar,ttihon,nkscales,nikola,sinopalnikov,schaerli}@google.com |
| Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | Yes | Available at https://github.com/google-research/google-research/tree/master/star_cfq |
| Open Datasets | Yes | We present here *-CFQ, a suite of datasets building on the same overall structure and base rule set as CFQ, but with two key differences intended to facilitate investigation of the scalability of solutions to the semantic parsing task. (Footnote 1: Available at https://github.com/google-research/google-research/tree/master/star_cfq) A data-loading sketch follows the table. |
| Dataset Splits | Yes | Keysers et al. (2020) introduced the Compositional Freebase Questions (CFQ), which is a simple but realistic and large natural language dataset that is specifically designed to measure compositional generalization. ... The authors release a number of MCD splits for CFQ, and show that there is a strong negative correlation between the accuracy of three standard sequence-to-sequence architectures and the compound divergence. (A sketch of the compound-divergence computation follows the table.) |
| Hardware Specification | No | The paper mentions "fixed computational cost" and "fixed model size and fixed training steps" but does not specify the underlying hardware (e.g., GPU/CPU models, memory) used for these computations. |
| Software Dependencies | No | The paper mentions using a Transformer architecture but does not specify software dependencies like specific versions of PyTorch, TensorFlow, or other libraries. |
| Experiment Setup | Yes | In each experiment, the model size and number of training steps (and hence, computational cost) are held constant, with hyperparameters as described in Appendix E. We use a Transformer architecture with hyperparameters described in Appendix E. |
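
The Open Datasets row above points to the *-CFQ repository, which shares its question-to-SPARQL format with the original CFQ dataset. Below is a minimal sketch of inspecting that format, assuming the base CFQ dataset is available through TensorFlow Datasets under the name `cfq` with MCD split configurations; the *-CFQ files themselves are distributed via the linked google-research repository and may need to be fetched separately, so this is an illustration rather than the authors' own loading code.

```python
# Minimal sketch: peek at the question -> SPARQL format shared by CFQ and *-CFQ.
# Assumption: the base CFQ dataset is packaged in TensorFlow Datasets as "cfq"
# with configurations such as "mcd1"; the *-CFQ suite lives in the GitHub repo
# referenced in the table and is not loaded here.
import tensorflow_datasets as tfds


def peek_cfq(config: str = "cfq/mcd1", n: int = 3) -> None:
    """Print a few (question, SPARQL query) pairs from a CFQ split."""
    ds = tfds.load(config, split="train")
    for example in ds.take(n):
        question = example["question"].numpy().decode("utf-8")
        query = example["query"].numpy().decode("utf-8")
        print(f"Q: {question}\nSPARQL: {query}\n")


if __name__ == "__main__":
    peek_cfq()
```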
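
The Dataset Splits row cites the MCD (maximum compound divergence) splits of Keysers et al. (2020). The sketch below illustrates the divergence measure behind those splits, assuming the Chernoff-coefficient formulation from that paper (1 minus the sum of p^alpha * q^(1-alpha), with alpha around 0.1 for compounds and 0.5 for atoms) and toy compound counts that are not real CFQ statistics.

```python
# Minimal sketch of the compound-divergence measure used for MCD splits.
# Assumptions: atoms/compounds have already been extracted and counted per
# split; divergence = 1 - sum_x p(x)^alpha * q(x)^(1-alpha), following the
# description in Keysers et al. (2020). Counts below are illustrative only.
from collections import Counter
from typing import Dict


def _normalize(counts: Counter) -> Dict[str, float]:
    total = sum(counts.values())
    return {k: v / total for k, v in counts.items()}


def divergence(train_counts: Counter, test_counts: Counter, alpha: float) -> float:
    """1 minus the Chernoff coefficient between two empirical distributions."""
    p = _normalize(train_counts)
    q = _normalize(test_counts)
    keys = set(p) | set(q)
    coeff = sum((p.get(k, 0.0) ** alpha) * (q.get(k, 0.0) ** (1.0 - alpha)) for k in keys)
    return 1.0 - coeff


# Toy compound counts (hypothetical, not real CFQ statistics).
train = Counter({"directed(M, X)": 50, "produced(M, X)": 30, "acted_in(X, M)": 20})
test = Counter({"directed(M, X)": 5, "edited(M, X)": 45, "acted_in(X, M)": 50})
print("compound divergence:", round(divergence(train, test, alpha=0.1), 3))
print("atom-style divergence:", round(divergence(train, test, alpha=0.5), 3))
```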