Recursion in Recursion: Two-Level Nested Recursion for Length Generalization with Scalability
Authors: Jishnu Ray Chowdhury, Cornelia Caragea
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 4 Experiments and Results; In Table 1, we compare the empirical time-memory trade-offs of the most relevant Tree-RvNN models. |
| Researcher Affiliation | Academia | Jishnu Ray Chowdhury, Cornelia Caragea; Computer Science, University of Illinois Chicago; jraych2@uic.edu, cornelia@uic.edu |
| Pseudocode | No | No section or figure explicitly labeled "Pseudocode" or "Algorithm" was found. |
| Open Source Code | Yes | Our code is available at: https://github.com/JRC1995/BeamRecursionFamily/. |
| Open Datasets | Yes | ListOps was originally introduced by Nangia and Bowman [70].; Long Range Arena (LRA): LRA is a set of tasks designed to evaluate the capacities of neural models for modeling long-range dependencies [92].; Logical Inference was introduced by Bowman et al. [6]. |
| Dataset Splits | Yes | We use the original development set for validation. We test on the original test set (near-IID split); the length generalization splits from Havrylov et al. [38]...; Our actual development set is a random sample of 10,000 data points from the filtered training set. |
| Hardware Specification | Yes | Table 1: Empirical time and (peak) memory consumption for various models on an RTX A6000.; All models were trained on a single Nvidia RTX A6000. |
| Software Dependencies | No | The paper mentions software like S4D, but does not provide specific version numbers for underlying libraries or programming languages (e.g., Python, PyTorch, CUDA versions). |
| Experiment Setup | Yes | For RIR-EBT-GRC, we use a beam size of 7 for all tasks except Retrieval LRA, where we use a beam size of 5. All other hyperparameters are unchanged from BT-GRC for RIR models, or from BBT-GRC for the earlier tasks. ... We initialize S4D (whether using the pure S4D model or S4D for pre-chunk processing) in S4D-Inv mode and use bilinear discretization. For LRA tasks, we use the same hyperparameters as Gu et al. [30] for S4D. |
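Two of the quoted setup details — a development set drawn as a random sample of 10,000 points from the filtered training set, and a task-dependent beam size for RIR-EBT-GRC (5 on Retrieval LRA, 7 elsewhere) — can be sketched as below. This is an illustrative sketch only: the function names, task-identifier strings, and random seed are assumptions, not taken from the authors' released code.

```python
import random


def split_train_dev(train_examples, dev_size=10_000, seed=0):
    """Sample a development set of `dev_size` points from the filtered
    training set, as the paper describes. The seed is an assumption."""
    rng = random.Random(seed)
    idx = list(range(len(train_examples)))
    rng.shuffle(idx)
    dev = [train_examples[i] for i in idx[:dev_size]]
    train = [train_examples[i] for i in idx[dev_size:]]
    return train, dev


def beam_size(task: str) -> int:
    """Beam size reported for RIR-EBT-GRC: 5 on Retrieval LRA, 7 on all
    other tasks. The task-name strings are hypothetical identifiers."""
    return 5 if task == "retrieval_lra" else 7
```

Sampling indices rather than examples keeps the split disjoint by construction, mirroring the usual practice when a benchmark ships no official validation split.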