Learning to acquire novel cognitive tasks with evolution, plasticity and meta-meta-learning
Authors: Thomas Miconi
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | As seen in Figure 2, the system successfully evolves an architecture that can automatically acquire a novel, unseen meta-learning task (left panel, red dashed curve). |
| Researcher Affiliation | Industry | Thomas Miconi 1; 1ML Collective. Correspondence to: Thomas Miconi <thomas.miconi@gmail.com>. |
| Pseudocode | No | The paper describes the algorithms and processes verbally and with equations, but does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | All code is available at https://github.com/ThomasMiconi/LearningToLearnCogTasks |
| Open Datasets | Yes | To provide a sizeable number of computationally tractable cognitive tasks, we use the formalism of Yang et al. (2019), which implements a large number of simple cognitive tasks from the animal neuroscience literature (memory-guided saccades, comparing two successive stimuli, etc.) in a common format. |
| Dataset Splits | No | The paper discusses 'training tasks' and a 'withheld test task' but does not explicitly provide information on standard training/validation/test dataset splits with percentages or sample counts. The term 'validation' for a data split is not used. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running its experiments, such as GPU or CPU models. |
| Software Dependencies | No | The paper mentions the use of the 'Adam optimizer' but does not provide version numbers for any software dependencies, libraries, or programming languages used in the experiments. |
| Experiment Setup | Yes | Each generation is a batch of 500 individuals. Each block is composed of 400 trials, each of which lasts 1000 ms. Following Yang et al. (2019), we use τ = 100 ms and simulation timesteps of 20 ms. Perturbations occur independently for each neuron with a probability of 0.1 at each timestep; perturbations are uniformly distributed in the [-0.5, 0.5] range. We set τH = 1000 ms, η = 0.03. At generation 0, W is initialized with Gaussian weights with mean 0 and standard deviation 1.5/√N, where N = 70 is the number of neurons in the network... while all values of Π are initialized to 0.5. Evolution runs over 1000 generations for DMS. We feed evolutionary gradients to the Adam optimizer, with a learning rate of 0.003. |
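
The Experiment Setup row quotes the paper's hyperparameters verbatim. The sketch below simply collects those values into one configuration and shows one plausible way the "evolutionary gradients fed to Adam" outer loop could be wired up; the `evaluate_population` stub and the Adam wiring are assumptions made for illustration, not the authors' implementation (see the linked repository for that).

```python
# Minimal sketch, assuming PyTorch for the Adam step. Only the numeric values
# come from the paper's "Experiment Setup" text; everything else is illustrative.
import math
import torch

CONFIG = {
    "population_size": 500,          # individuals per generation
    "trials_per_block": 400,
    "trial_duration_ms": 1000,
    "tau_ms": 100,                   # neural time constant (Yang et al., 2019)
    "dt_ms": 20,                     # simulation timestep
    "perturbation_prob": 0.1,        # per neuron, per timestep
    "perturbation_range": (-0.5, 0.5),
    "tau_H_ms": 1000,
    "eta": 0.03,
    "n_neurons": 70,
    "n_generations": 1000,           # for the DMS task
    "adam_lr": 0.003,
}

N = CONFIG["n_neurons"]

# Evolved parameters: recurrent weights W (Gaussian, mean 0, std 1.5/sqrt(N))
# and plasticity coefficients Pi, all initialized to 0.5, as quoted above.
W = torch.nn.Parameter(torch.randn(N, N) * (1.5 / math.sqrt(N)))
Pi = torch.nn.Parameter(torch.full((N, N), 0.5))
optimizer = torch.optim.Adam([W, Pi], lr=CONFIG["adam_lr"])


def evaluate_population(params, cfg):
    """Hypothetical stub: evaluate a batch of 500 perturbed individuals over a
    block of 400 trials and return an ES-style gradient estimate per parameter."""
    return [torch.zeros_like(p) for p in params]


for generation in range(CONFIG["n_generations"]):
    grads = evaluate_population([W, Pi], CONFIG)
    optimizer.zero_grad()
    for p, g in zip([W, Pi], grads):
        p.grad = g                   # inject the evolutionary gradient estimate
    optimizer.step()
```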