Learning Structure from the Ground up---Hierarchical Representation Learning by Chunking
Authors: Shuchen Wu, Noémi Éltető, Ishita Dasgupta, Eric Schulz
NeurIPS 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We provide learning guarantees on an idealized version of HCM, and demonstrate that HCM learns meaningful and interpretable representations in a human-like fashion. Our model can be extended to learn visual, temporal, and visual-temporal chunks. The interpretability of the learned chunks can be used to assess transfer or interference when the environment changes. Finally, in an fMRI dataset, we demonstrate that HCM learns interpretable chunks of functional coactivation regions and hierarchical modular and sub-modular structures supported by the neuroscientific literature. |
| Researcher Affiliation | Academia | Shuchen Wu, Computational Principles of Intelligence Lab, Max Planck Institute for Biological Cybernetics, Tübingen, Germany, shuchen.wu@tuebingen.mpg.de; Noémi Éltető, Department of Computational Neuroscience, Max Planck Institute for Biological Cybernetics, Tübingen, Germany, noemi.elteto@tuebingen.mpg.de; Ishita Dasgupta, Computational Cognitive Science Lab, Department of Psychology, Princeton University, dasgupta.ishita@gmail.com; Eric Schulz, Computational Principles of Intelligence Lab, Max Planck Institute for Biological Cybernetics, Tübingen, Germany, eric.schulz@tuebingen.mpg.de |
| Pseudocode | Yes | Pseudo-code for both algorithms can be found in the SI. |
| Open Source Code | Yes | Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Yes] They will be included in the supplementary material and publicly available. |
| Open Datasets | Yes | Here we compare the chunk learning behavior of HCM to the learning characteristics of humans. To that end, we used data collected from a sequence learning study by [23] with 47 participants under the license CC-BY 4.0. We used a developmental data set provided by the nilearn package with BSD License [35] and originally collected by [36] with its corresponding IRB approval. |
| Dataset Splits | No | The paper discusses varying sequence lengths for training ('sequence length increasing from 50 to 3000') and using existing data, but it does not specify explicit train/validation/test splits with percentages or sample counts. |
| Hardware Specification | No | The paper states only that results were obtained without special computing resources; no specific hardware is listed. |
| Software Dependencies | No | The paper mentions using the 'nilearn package' but does not specify its version. No other software dependencies with version numbers are mentioned in the main text. |
| Experiment Setup | No | Details on the experiments will be included in the supplementary. The main text does not contain specific hyperparameters or training configurations for HCM or the RNNs beyond the RNN architecture (3 layers, 40 hidden units); a minimal sketch of such a baseline appears below the table. |
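
The Open Datasets row cites the developmental fMRI dataset distributed with the nilearn package. As a hedged sketch of how that dataset can be loaded and reduced to region-wise time series (the subject count and the MSDL parcellation below are illustrative assumptions, not choices documented in the paper):

```python
# Sketch only: loads the nilearn developmental fMRI dataset mentioned in the
# Open Datasets row. n_subjects and the MSDL atlas are assumptions for illustration.
from nilearn import datasets
from nilearn.maskers import NiftiMapsMasker

# Fetch a small subset of the developmental fMRI dataset
development = datasets.fetch_development_fmri(n_subjects=5)

# Reduce each subject's 4D image to region-wise time series with a probabilistic atlas
msdl = datasets.fetch_atlas_msdl()
masker = NiftiMapsMasker(maps_img=msdl.maps, standardize=True)

time_series = [
    masker.fit_transform(func, confounds=conf)
    for func, conf in zip(development.func, development.confounds)
]
print(time_series[0].shape)  # (n_timepoints, n_regions)
```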
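
The Experiment Setup row notes that the only RNN detail given in the main text is the architecture: 3 recurrent layers with 40 hidden units. Below is a minimal, assumed sketch of such a next-symbol-prediction baseline in PyTorch; the vocabulary size and loss computation are placeholders, not the authors' configuration.

```python
# Assumed sketch of a 3-layer, 40-hidden-unit RNN baseline for next-symbol prediction.
# Not the authors' implementation; vocab_size and training details are placeholders.
import torch
import torch.nn as nn

class NextSymbolRNN(nn.Module):
    def __init__(self, vocab_size: int, hidden_size: int = 40, num_layers: int = 3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden_size)
        self.rnn = nn.RNN(hidden_size, hidden_size, num_layers=num_layers, batch_first=True)
        self.out = nn.Linear(hidden_size, vocab_size)

    def forward(self, x):
        h, _ = self.rnn(self.embed(x))
        return self.out(h)  # logits for the next symbol at every position

model = NextSymbolRNN(vocab_size=4)
seq = torch.randint(0, 4, (1, 50))    # one toy training sequence of length 50
logits = model(seq[:, :-1])           # predict each following symbol
loss = nn.functional.cross_entropy(logits.reshape(-1, 4), seq[:, 1:].reshape(-1))
```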