*-CFQ: Analyzing the Scalability of Machine Learning on a Compositional Task
Authors: Dmitry Tsarkov, Tibor Tihon, Nathan Scales, Nikola Momchev, Danila Sinopalnikov, Nathanael Schärli
AAAI 2021, pp. 9949-9957 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Using this suite, we conduct a series of experiments investigating the ability of Transformers to benefit from increased training size under conditions of fixed computational cost. We show that compositional generalization remains a challenge at all training sizes, and we show that increasing the scope of natural language leads to consistently higher error rates, which are only partially offset by increased training data. |
| Researcher Affiliation | Industry | Google Research, Brain Team {tsar,ttihon,nkscales,nikola,sinopalnikov,schaerli}@google.com |
| Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | Yes | Available at https://github.com/google-research/google-research/tree/master/star_cfq |
| Open Datasets | Yes | We present here *-CFQ, a suite of datasets building on the same overall structure and base rule set as CFQ, but with two key differences intended to facilitate investigation of the scalability of solutions to the semantic parsing task. (Footnote 1: Available at https://github.com/google-research/google-research/tree/master/star_cfq) A data-loading sketch follows the table. |
| Dataset Splits | Yes | Keysers et al. (2020) introduced the Compositional Freebase Questions (CFQ), which is a simple but realistic and large natural language dataset that is specifically designed to measure compositional generalization. ... The authors release a number of MCD splits for CFQ, and show that there is a strong negative correlation between the accuracy of three standard sequence-to-sequence architectures and the compound divergence. (A sketch of the compound-divergence computation follows the table.) |
| Hardware Specification | No | The paper mentions "fixed computational cost" and "fixed model size and fixed training steps" but does not specify the underlying hardware (e.g., GPU/CPU models, memory) used for these computations. |
| Software Dependencies | No | The paper mentions using a Transformer architecture but does not specify software dependencies like specific versions of PyTorch, TensorFlow, or other libraries. |
| Experiment Setup | Yes | In each experiment, the model size and number of training steps (and hence, computational cost) are held constant, with hyperparameters as described in Appendix E. We use a Transformer architecture with hyperparameters described in Appendix E. |
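
The Open Datasets row above points to the *-CFQ repository, which shares its question-to-SPARQL format with the original CFQ dataset. Below is a minimal sketch of inspecting that format, assuming the base CFQ dataset is available through TensorFlow Datasets under the name `cfq` with MCD split configurations; the *-CFQ files themselves are distributed via the linked google-research repository and may need to be fetched separately, so this is an illustration rather than the authors' own loading code.

```python
# Minimal sketch: peek at the question -> SPARQL format shared by CFQ and *-CFQ.
# Assumption: the base CFQ dataset is packaged in TensorFlow Datasets as "cfq"
# with configurations such as "mcd1"; the *-CFQ suite lives in the GitHub repo
# referenced in the table and is not loaded here.
import tensorflow_datasets as tfds


def peek_cfq(config: str = "cfq/mcd1", n: int = 3) -> None:
    """Print a few (question, SPARQL query) pairs from a CFQ split."""
    ds = tfds.load(config, split="train")
    for example in ds.take(n):
        question = example["question"].numpy().decode("utf-8")
        query = example["query"].numpy().decode("utf-8")
        print(f"Q: {question}\nSPARQL: {query}\n")


if __name__ == "__main__":
    peek_cfq()
```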
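
The Dataset Splits row cites the MCD (maximum compound divergence) splits of Keysers et al. (2020). The sketch below illustrates the divergence measure behind those splits, assuming the Chernoff-coefficient formulation from that paper (1 minus the sum of p^alpha * q^(1-alpha), with alpha around 0.1 for compounds and 0.5 for atoms) and toy compound counts that are not real CFQ statistics.

```python
# Minimal sketch of the compound-divergence measure used for MCD splits.
# Assumptions: atoms/compounds have already been extracted and counted per
# split; divergence = 1 - sum_x p(x)^alpha * q(x)^(1-alpha), following the
# description in Keysers et al. (2020). Counts below are illustrative only.
from collections import Counter
from typing import Dict


def _normalize(counts: Counter) -> Dict[str, float]:
    total = sum(counts.values())
    return {k: v / total for k, v in counts.items()}


def divergence(train_counts: Counter, test_counts: Counter, alpha: float) -> float:
    """1 minus the Chernoff coefficient between two empirical distributions."""
    p = _normalize(train_counts)
    q = _normalize(test_counts)
    keys = set(p) | set(q)
    coeff = sum((p.get(k, 0.0) ** alpha) * (q.get(k, 0.0) ** (1.0 - alpha)) for k in keys)
    return 1.0 - coeff


# Toy compound counts (hypothetical, not real CFQ statistics).
train = Counter({"directed(M, X)": 50, "produced(M, X)": 30, "acted_in(X, M)": 20})
test = Counter({"directed(M, X)": 5, "edited(M, X)": 45, "acted_in(X, M)": 50})
print("compound divergence:", round(divergence(train, test, alpha=0.1), 3))
print("atom-style divergence:", round(divergence(train, test, alpha=0.5), 3))
```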