Learning Structure from the Ground up — Hierarchical Representation Learning by Chunking

Authors: Shuchen Wu, Noémi Éltető, Ishita Dasgupta, Eric Schulz

NeurIPS 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental We provide learning guarantees on an idealized version of HCM, and demonstrate that HCM learns meaningful and interpretable representations in a human-like fashion. Our model can be extended to learn visual, temporal, and visual-temporal chunks. The interpretability of the learned chunks can be used to assess transfer or interference when the environment changes. Finally, in an fMRI dataset, we demonstrate that HCM learns interpretable chunks of functional coactivation regions and hierarchical modular and sub-modular structures supported by the neuroscientific literature.
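To make the chunking idea concrete: HCM builds larger representational units out of parts that frequently co-occur. The sketch below is not the authors' HCM (its learning rule and guarantees are in the paper and SI); it is a minimal, hypothetical illustration of one frequency-based chunk-merging step on a symbol sequence, where repeated application yields chunks of chunks.

```python
from collections import Counter

def merge_most_frequent_pair(sequence, min_count=2):
    """One chunking step: find the most frequent adjacent pair of
    chunks and merge its occurrences into a single new chunk.
    Repeating this builds a hierarchy of chunks-of-chunks.
    NOTE: illustrative sketch only, not the paper's HCM algorithm."""
    pairs = Counter(zip(sequence, sequence[1:]))
    if not pairs:
        return sequence, None
    (a, b), count = pairs.most_common(1)[0]
    if count < min_count:
        return sequence, None  # nothing recurs often enough to chunk
    merged, i = [], 0
    while i < len(sequence):
        if i + 1 < len(sequence) and (sequence[i], sequence[i + 1]) == (a, b):
            merged.append((a, b))  # the new chunk nests its parts
            i += 2
        else:
            merged.append(sequence[i])
            i += 1
    return merged, (a, b)

# Example: 'A' and 'B' co-occur most often, so they are chunked first.
seq, chunk = merge_most_frequent_pair(list("ABCABDABC"))
```

Running the example merges the recurring pair into a nested tuple, so `seq` becomes `[('A','B'), 'C', ('A','B'), 'D', ('A','B'), 'C']`; applying the step again would chunk `(('A','B'), 'C')`, giving the hierarchical structure the paper describes.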
Researcher Affiliation Academia Shuchen Wu, Computational Principles of Intelligence Lab, Max Planck Institute for Biological Cybernetics, Tübingen, Germany (shuchen.wu@tuebingen.mpg.de); Noémi Éltető, Department of Computational Neuroscience, Max Planck Institute for Biological Cybernetics, Tübingen, Germany (noemi.elteto@tuebingen.mpg.de); Ishita Dasgupta, Computational Cognitive Science Lab, Department of Psychology, Princeton University (dasgupta.ishita@gmail.com); Eric Schulz, Computational Principles of Intelligence Lab, Max Planck Institute for Biological Cybernetics, Tübingen, Germany (eric.schulz@tuebingen.mpg.de)
Pseudocode Yes Pseudo-code for both algorithms can be found in the SI.
Open Source Code Yes Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Yes] They will be included in the supplementary material and publicly available.
Open Datasets Yes Here we compare the chunk learning behavior of HCM to the learning characteristics of humans. To that end, we used data collected from a sequence learning study by [23] with 47 participants under the license CC-BY 4.0. We used a developmental data set provided by the nilearn package with BSD License [35] and originally collected by [36] with its corresponding IRB approval.
Dataset Splits No The paper discusses varying training sequence lengths ('sequence length increasing from 50 to 3000') and uses existing data, but it does not specify explicit train/validation/test splits with percentages or sample counts.
Hardware Specification No The paper states only that results were obtained without special computing resources; no specific hardware is described.
Software Dependencies No The paper mentions using the 'nilearn package' but does not specify its version. No other software dependencies with version numbers are mentioned in the main text.
Experiment Setup No Details on the experiments are deferred to the supplementary material. The main text does not contain specific hyperparameters or training configurations for HCM or the RNNs, beyond the RNN architecture (3 layers, 40 hidden units).