Schema-learning and rebinding as mechanisms of in-context learning and emergence

Authors: Sivaramakrishnan Swaminathan, Antoine Dedieu, Rajkumar Vasudeva Raju, Murray Shanahan, Miguel Lázaro-Gredilla, Dileep George

NeurIPS 2023

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "We substantiate the above argument using empirical results on three datasets: (a) the GINC benchmark introduced in [3], (b) a suite of algorithm learning tasks that we introduce in our LIALT datasets, and (c) a zero-shot word usage induction task on a CSCG language model." |
| Researcher Affiliation | Industry | Sivaramakrishnan Swaminathan, Antoine Dedieu, Rajkumar Vasudeva Raju, Murray Shanahan, Miguel Lázaro-Gredilla, Dileep George — Google DeepMind, {sivark,adedieu,rajvraju,mshanahan,lazarogredilla,dileepgeorge}@google.com |
| Pseudocode | Yes | Algorithm 1: Fast rebinding algorithm; Algorithm 2: Prompt completion |
| Open Source Code | No | The paper does not provide an explicit statement of, or link to, its open-source code. |
| Open Datasets | Yes | "The GINC dataset [3] introduced for studying ICL... We train a single CSCG with 50 clones on the GINC dataset... To test for this capability, we train a CSCG on the PreCo dataset [26], which is a large-scale English dataset for coreference resolution." |
| Dataset Splits | No | The paper describes its training and test sets but does not provide explicit details on a validation split (e.g., percentages or counts) from the overall dataset. |
| Hardware Specification | No | The paper does not specify the hardware used for its experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers. |
| Experiment Setup | Yes | "We train a single CSCG with 50 clones on the GINC dataset for 100 full-batch EM iterations using a pseudocount [6] of ϵ = 10⁻²... We parameterize CSCG capacity via this proportionality factor, the overallocation ratio... We train CSCGs for an increasing sequence of overallocation ratios on the training data with 500 EM iterations and a pseudocount of ϵ = 10⁻⁶. After running EM, we run 10 iterations of Viterbi training [23]. We use Algorithm 1 with ϵ = 10⁻⁶ and p_surprise = 1/16 to rebind the emission matrix on each of these prompts." |
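For anyone attempting to reproduce these experiments, the reported hyperparameters can be collected into a small configuration sketch. This is an illustrative assumption, not code from the paper: the class and field names (`GINCTrainConfig`, `LIALTTrainConfig`, `RebindConfig`, `n_clones`, etc.) are hypothetical; only the numeric values are taken from the quoted setup.

```python
from dataclasses import dataclass

@dataclass
class GINCTrainConfig:
    """Reported settings for the single CSCG trained on GINC."""
    n_clones: int = 50          # 50 clones per token
    em_iterations: int = 100    # full-batch EM iterations
    pseudocount: float = 1e-2   # ϵ = 10⁻² [6]

@dataclass
class LIALTTrainConfig:
    """Reported settings for the overallocation-ratio sweep on LIALT."""
    em_iterations: int = 500       # EM iterations per overallocation ratio
    viterbi_iterations: int = 10   # Viterbi training [23] after EM
    pseudocount: float = 1e-6      # ϵ = 10⁻⁶

@dataclass
class RebindConfig:
    """Reported settings for Algorithm 1 (fast rebinding) on prompts."""
    pseudocount: float = 1e-6      # ϵ = 10⁻⁶
    p_surprise: float = 1 / 16     # surprise threshold for rebinding
```

Since the paper does not release code, hardware details, or dependency versions, a configuration record like this is the closest a reproducer can get from the text alone.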