Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Disentangling and mitigating the impact of task similarity for continual learning
Authors: Naoki Hiratani
NeurIPS 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate consistent results in a permuted MNIST task with latent variables. Overall, this work provides insights into when continual learning is difficult and how to mitigate it. [...] Furthermore, we test our key predictions numerically in a permuted MNIST task with a latent structure. |
| Researcher Affiliation | Academia | Naoki Hiratani, Department of Neuroscience, Washington University in St Louis, St Louis, MO 63110, EMAIL |
| Pseudocode | No | The paper describes methods mathematically and textually but does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Source codes for all numerical results are made publicly available at https://github.com/nhiratani/transfer_retention_model. |
| Open Datasets | Yes | We used permuted MNIST dataset [34, 22], a common benchmark for continual learning, but with addition of the latent space. |
| Dataset Splits | No | The paper describes training and testing procedures and parameters like epochs and learning rates, but it does not explicitly mention the use of a separate validation dataset or how a validation split was performed. |
| Hardware Specification | No | Numerical experiments were conducted in standard laboratory GPUs and CPUs. The paper does not provide specific models or detailed specifications for the hardware used. |
| Software Dependencies | No | The paper states that source code is available but does not explicitly list specific software dependencies or their version numbers within the text. |
| Experiment Setup | Yes | We set the latent variable dimensionality Ns = 30, the input width Nx = 3000, and the output width Ny = 10. The student weight W was initialized as the zero matrix, and updated with the full gradient descent with learning rate η = 0.001. [...] We set the hidden layer width to Nh = 1500. The input and output widths were set to Nx = 784 and Ny = 10. [...] We set the mini-batch size to 300 and the learning rate to η = 0.01, and trained the network for 100 epochs per task. |
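
The hyperparameters quoted in the Experiment Setup row can be gathered into a small configuration sketch, which may help when comparing a rerun against the reported values. This is only an illustration: the class and field names below are hypothetical and are not taken from the paper or its repository, and the grouping into two settings (one mentioning a student weight W, one for the permuted MNIST network) is an assumption based on the quoted excerpts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinearStudentConfig:
    """First quoted setting (hypothetical names; values from the table row)."""
    latent_dim: int = 30           # Ns, latent variable dimensionality
    input_width: int = 3000        # Nx
    output_width: int = 10         # Ny
    learning_rate: float = 0.001   # eta, used with full gradient descent
    weight_init: str = "zeros"     # student weight W starts as the zero matrix


@dataclass(frozen=True)
class PermutedMnistConfig:
    """Second quoted setting (hypothetical names; values from the table row)."""
    hidden_width: int = 1500       # Nh
    input_width: int = 784         # Nx
    output_width: int = 10         # Ny
    batch_size: int = 300          # mini-batch size
    learning_rate: float = 0.01    # eta
    epochs_per_task: int = 100


linear_cfg = LinearStudentConfig()
mnist_cfg = PermutedMnistConfig()
```

Freezing the dataclasses keeps the recorded values from being changed accidentally during a run; anything not quoted in the row (optimizer details, seeds, number of tasks) is deliberately left out rather than guessed.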