Online learning of long-range dependencies
Authors: Nicolas Zucchet, Robert Meier, Simon Schug, Asier Mujika, João Sacramento
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on synthetic memory problems and on the challenging long-range arena benchmark suite reveal that our algorithm performs competitively, establishing a new standard for what can be achieved through online learning. |
| Researcher Affiliation | Academia | Department of Computer Science, ETH Zürich {nzucchet,romeier,sschug,asierm,rjoao}@ethz.ch |
| Pseudocode | No | The paper provides detailed mathematical derivations of the algorithm's updates in Appendix A.2, but it does not include a formally labeled 'Pseudocode' or 'Algorithm' block. |
| Open Source Code | No | The paper states 'We base our implementation on the S5 [22] code base4.' with footnote 4 linking to 'https://github.com/lindermanlab/S5'. This is a third-party codebase they built upon, not a release of their own code for the methodology presented in this paper. |
| Open Datasets | Yes | Finally, we evaluate our method on three tasks of the Long Range Arena [24]: a sequential version of CIFAR [41], LISTOPS [42] and IMDB [43] |
| Dataset Splits | Yes | For additional experimental details and hyperparameter configurations, we refer to Appendix B. (Appendix B contains Tables 4, 5, and 6, which list hyperparameters such as 'Training samples', 'Batch-size', 'Epochs', and 'Warmup', implicitly defining how the data were split and used for training and tuning.) |
| Hardware Specification | Yes | The training time for our online learning rule on a single Nvidia RTX3090 GPU for SCIFAR, IMDB and LISTOPS was respectively 36, 10 and 40 hours. |
| Software Dependencies | No | The paper mentions using the 'Adam W optimizer' and basing their implementation on 'S5 [22] code base', but it does not specify version numbers for programming languages, libraries, or other software dependencies. |
| Experiment Setup | Yes | For additional experimental details and hyperparameter configurations, we refer to Appendix B. (Tables 4, 5, 6 provide detailed hyperparameters such as 'Batch-size', 'Base learning rate', 'Epochs', 'Dropout probability', etc.) |