Learning to acquire novel cognitive tasks with evolution, plasticity and meta-meta-learning

Author: Thomas Miconi

Venue: ICML 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | As seen in Figure 2, the system successfully evolves an architecture that can automatically acquire a novel, unseen meta-learning task (left panel, red dashed curve).
Researcher Affiliation | Industry | Thomas Miconi (ML Collective). Correspondence to: Thomas Miconi <thomas.miconi@gmail.com>.
Pseudocode | No | The paper describes the algorithms and processes verbally and with equations, but does not include any structured pseudocode or algorithm blocks.
Open Source Code | Yes | All code is available at https://github.com/ThomasMiconi/LearningToLearnCogTasks
Open Datasets | Yes | To provide a sizeable number of computationally tractable cognitive tasks, we use the formalism of Yang et al. (2019), which implements a large number of simple cognitive tasks from the animal neuroscience literature (memory-guided saccades, comparing two successive stimuli, etc.) in a common format. (An illustrative sketch of this trial format follows the table.)
Dataset Splits | No | The paper distinguishes 'training tasks' from a 'withheld test task' but does not report standard training/validation/test dataset splits with percentages or sample counts; the term 'validation' is never used for a data split.
Hardware Specification | No | The paper does not provide specific details about the hardware used to run its experiments, such as GPU or CPU models.
Software Dependencies | No | The paper mentions the Adam optimizer but does not provide version numbers for any software dependencies, libraries, or programming languages used in the experiments.
Experiment Setup | Yes | Each generation is a batch of 500 individuals. Each block is composed of 400 trials, each of which lasts 1000 ms. Following Yang et al. (2019), we use τ = 100 ms and simulation timesteps of 20 ms. Perturbations occur independently for each neuron with a probability of 0.1 at each timestep; perturbations are uniformly distributed in the [-0.5, 0.5] range. We set τH = 1000 ms, η = 0.03. At generation 0, W is initialized with Gaussian weights with mean 0 and standard deviation 1.5/√N, where N = 70 is the number of neurons in the network... while all values of Π are initialized to 0.5. Evolution runs over 1000 generations for DMS. We feed evolutionary gradients to the Adam optimizer, with a learning rate of 0.003. (An illustrative sketch of this outer loop follows the table.)
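
For readers unfamiliar with the Yang et al. (2019) format cited in the Open Datasets row, below is a minimal sketch of how one trial of a delayed match-to-sample task (the DMS task named in the setup row) could be laid out as a time-by-channel input array. Only the 20 ms timestep and 1000 ms trial length come from the quoted setup; the channel layout, epoch boundaries, and the make_dms_trial helper are illustrative assumptions, not the paper's actual interface.

```python
import numpy as np

DT = 20             # ms per simulation step (from the setup row)
TRIAL_MS = 1000     # trial duration (from the setup row)
T = TRIAL_MS // DT  # 50 timesteps per trial

def make_dms_trial(rng, n_stim_channels=2):
    """Illustrative delayed match-to-sample trial: one fixation channel
    plus stimulus channels; epoch boundaries are assumed for illustration."""
    x = np.zeros((T, 1 + n_stim_channels))  # col 0 = fixation, rest = stimuli
    x[:40, 0] = 1.0                         # fixate until the response epoch
    s1 = rng.integers(n_stim_channels)      # first (sample) stimulus
    s2 = rng.integers(n_stim_channels)      # second (possibly matching) stimulus
    x[5:15, 1 + s1] = 1.0                   # sample epoch
    x[25:35, 1 + s2] = 1.0                  # test epoch, after a delay
    target = 1.0 if s1 == s2 else -1.0      # desired "match" / "non-match" output
    return x, target

x, target = make_dms_trial(np.random.default_rng(0))
print(x.shape, target)  # (50, 3) and +/-1.0
```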
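
The setup row also describes the outer evolutionary loop: batches of 500 individuals, 1000 generations, and evolutionary gradients fed to Adam with a learning rate of 0.003. Below is a minimal numpy sketch of such a loop over the evolved weight matrix W, assuming a standard evolution-strategies gradient estimate. The perturbation scale SIGMA, the score normalization, and the lifetime_fitness placeholder are assumptions not stated in the quoted setup; the real fitness comes from running the plastic network through 400-trial task blocks.

```python
import numpy as np

# Values reported in the setup row; SIGMA is an assumed ES noise scale.
POP_SIZE = 500
N_GENERATIONS = 1000
N = 70
LR = 0.003
SIGMA = 0.1  # assumption: not stated in the excerpt

rng = np.random.default_rng(0)

# Evolved parameters at generation 0: Gaussian W with std 1.5/sqrt(N),
# flattened into one vector for the optimizer. (The paper also evolves
# plasticity coefficients Pi, initialized to 0.5, omitted here.)
theta = rng.normal(0.0, 1.5 / np.sqrt(N), N * N)

def lifetime_fitness(params):
    """Placeholder: the real evaluation runs the plastic network through
    a 400-trial block of cognitive tasks and scores its performance."""
    return -np.sum(params ** 2)  # toy objective, for illustration only

# Minimal hand-rolled Adam state for the flattened parameter vector.
m = np.zeros_like(theta)
v = np.zeros_like(theta)
b1, b2, eps = 0.9, 0.999, 1e-8

for gen in range(1, N_GENERATIONS + 1):
    noise = rng.standard_normal((POP_SIZE, theta.size))
    scores = np.array([lifetime_fitness(theta + SIGMA * n_i) for n_i in noise])
    scores = (scores - scores.mean()) / (scores.std() + 1e-8)
    # Vanilla-ES gradient estimate; negated because Adam minimizes.
    grad = -(noise.T @ scores) / (POP_SIZE * SIGMA)
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad ** 2
    theta -= LR * (m / (1 - b1 ** gen)) / (np.sqrt(v / (1 - b2 ** gen)) + eps)
```

Normalizing fitness scores before estimating the gradient is a common ES stabilization; whether the paper ranks, normalizes, or uses raw fitness is not stated in this excerpt.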