Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026.
Recursion in Recursion: Two-Level Nested Recursion for Length Generalization with Scalability
Authors: Jishnu Ray Chowdhury, Cornelia Caragea
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 4 Experiments and Results; In Table 1, we compare the empirical time-memory trade-offs of the most relevant Tree-RvNN models. |
| Researcher Affiliation | Academia | Jishnu Ray Chowdhury, Cornelia Caragea; Computer Science, University of Illinois Chicago |
| Pseudocode | No | No section or figure explicitly labeled "Pseudocode" or "Algorithm" was found. |
| Open Source Code | Yes | Our code is available at: https://github.com/JRC1995/BeamRecursionFamily/. |
| Open Datasets | Yes | ListOps was originally introduced by Nangia and Bowman [70].; Long Range Arena (LRA): LRA is a set of tasks designed to evaluate the capacities of neural models for modeling long-range dependencies [92].; Logical Inference was introduced by Bowman et al. [6]. |
| Dataset Splits | Yes | We use the original development set for validation. We test on the original test set (near-IID split); the length generalization splits from Havrylov et al. [38]...; Our actual development set is a random sample of 10,000 data points from the filtered training set. |
| Hardware Specification | Yes | Table 1: Empirical time and (peak) memory consumption for various models on an RTX A6000.; All models were trained on a single Nvidia RTX A6000. |
| Software Dependencies | No | The paper mentions software like S4D, but does not provide specific version numbers for underlying libraries or programming languages (e.g., Python, PyTorch, CUDA versions). |
| Experiment Setup | Yes | For RIR-EBT-GRC, we use a beam size of 7 for all tasks except Retrieval LRA where we use a beam size of 5. All other hyperparameters are unchanged from BT-GRC for RIR-models or BBT-GRC for the earlier tasks. ... We initialize S4D (whether when using the pure S4D model or S4D for pre-chunk processing) in S4D-Inv mode and use bilinear discretization. For LRA tasks, we use the same hyperparameters as Gu et al. [30] for S4D. |
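
The settings quoted in the Experiment Setup row could be recorded in a small configuration sketch like the one below. This is only an illustration: the variable names, task keys, and the `beam_size_for` helper are hypothetical and are not taken from the authors' repository. Only the beam sizes (7 by default, 5 for Retrieval LRA), the S4D-Inv initialization mode, and the bilinear discretization come from the row above.

```python
# Illustrative sketch only: every name here is hypothetical, not the authors' code.
# Values taken from the Experiment Setup row: beam size 7 by default,
# beam size 5 for Retrieval LRA, S4D-Inv initialization, bilinear discretization.

DEFAULT_BEAM_SIZE = 7

# Hypothetical task key; the table only says "Retrieval LRA".
BEAM_SIZE_OVERRIDES = {"retrieval_lra": 5}

S4D_SETTINGS = {
    "init_mode": "S4D-Inv",
    "discretization": "bilinear",
}


def beam_size_for(task: str) -> int:
    """Return the RIR-EBT-GRC beam size for a task key (case-insensitive)."""
    return BEAM_SIZE_OVERRIDES.get(task.lower(), DEFAULT_BEAM_SIZE)
```

Keeping the single exception in an override map means any task not listed falls back to the default beam size, which mirrors how the row states the setting.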