The Acquisition of Physical Knowledge in Generative Neural Networks

Authors: Luca M. Schulze Buschoff, Eric Schulz, Marcel Binz

ICML 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We find that while our models are able to accurately predict a number of physical processes, their learning trajectories under both hypotheses do not follow the developmental trajectories of children. ... First, we show how both hypotheses can be instantiated in a β-variational autoencoder (β-VAE) framework. We then probe models with different degrees of complexity and optimization on physical reasoning tasks using violation-of-expectation (VOE) methods (Piloto et al., 2018; Smith et al., 2019). Finally, we compare the learning trajectories of these artificial systems to the developmental trajectories of children." (See the β-VAE objective sketch after the table.)
Researcher Affiliation | Academia | "MPRG Computational Principles of Intelligence, Max Planck Institute for Biological Cybernetics, Tübingen, Germany."
Pseudocode | No | The paper describes the model architecture and objective in Section 3.1, but no pseudocode or algorithm blocks are provided.
Open Source Code | No | "The complete code for this project, including our model implementation, is available upon request."
Open Datasets | No | "For each of these processes, we generated training data sets inspired by experiments from developmental psychology using the Unity game engine (Unity Technologies, 2005)."
Dataset Splits | Yes | "It was randomly split into 99,000 training sequences and 1,000 validation sequences." (A random-split sketch follows the table.)
Hardware Specification | Yes | "The models were trained on an NVIDIA Quadro RTX 5000 for roughly 7 days."
Software Dependencies | No | "The models were implemented in PyTorch (Paszke et al., 2019)."
Experiment Setup | Yes | "For all models, the size of the stochastic hidden dimension s_t was kept at 20, while the size of the deterministic hidden dimension h_t was set to 200, as in previous implementations of the RSSM (Hafner et al., 2019; Saxena et al., 2021). ... The models were trained for 180 epochs using a batch size of 32. The loss function was optimized using the Adam optimiser with a learning rate of 0.001 (Kingma & Ba, 2014), which was divided by 10 every 50 epochs." (A training-loop sketch follows the table.)
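
The Research Type row notes that both hypotheses are instantiated in a β-VAE framework. The standard β-VAE objective is a reconstruction term plus a β-weighted KL divergence; the minimal PyTorch sketch below illustrates it, assuming a pixel-valued reconstruction, a diagonal-Gaussian posterior, and an illustrative β value (none of these details are reported in the quoted excerpts, and the paper's full model is an RSSM variant rather than this plain form).

```python
import torch
import torch.nn.functional as F

def beta_vae_loss(x, x_recon, mu, logvar, beta=4.0):
    """beta-VAE objective: reconstruction plus beta-weighted KL term.

    beta=4.0 is an illustrative default; the quoted excerpts do not
    report the value used in the paper.
    """
    # Pixel-wise reconstruction error (assumes x and x_recon lie in [0, 1]).
    recon = F.binary_cross_entropy(x_recon, x, reduction="sum")
    # KL divergence between the diagonal-Gaussian posterior N(mu, sigma^2)
    # and the unit-Gaussian prior N(0, I).
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + beta * kl

# Example shapes: a batch of 32 frames with a 20-dim latent (20 matches the
# paper's stochastic hidden dimension; the 64x64 frame size is an assumption).
x = torch.rand(32, 1, 64, 64)
x_recon = torch.sigmoid(torch.randn(32, 1, 64, 64))
mu, logvar = torch.zeros(32, 20), torch.zeros(32, 20)
loss = beta_vae_loss(x, x_recon, mu, logvar)
```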
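
The 99,000/1,000 split from the Dataset Splits row can be reproduced with `torch.utils.data.random_split`. In this sketch the dataset object and the seed are placeholders, not details from the paper:

```python
import torch
from torch.utils.data import TensorDataset, random_split

# Placeholder for the 100,000 Unity-generated sequences; the real dataset
# construction is not described in the quoted excerpts.
sequences = TensorDataset(torch.zeros(100_000, 1))

generator = torch.Generator().manual_seed(0)  # the seed is an assumption
train_set, val_set = random_split(sequences, [99_000, 1_000], generator=generator)
print(len(train_set), len(val_set))  # -> 99000 1000
```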
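
The Experiment Setup row fixes the optimisation hyperparameters (Adam, learning rate 0.001 divided by 10 every 50 epochs, 180 epochs, batch size 32), which map onto a standard PyTorch loop. In the sketch below, the model and loss are placeholders standing in for the paper's RSSM-style architecture, and `train_set` is the split from the previous sketch:

```python
import torch
from torch.optim import Adam
from torch.optim.lr_scheduler import StepLR
from torch.utils.data import DataLoader

# Placeholders standing in for the paper's RSSM-style model (stochastic
# dimension 20, deterministic dimension 200) and its objective.
model = torch.nn.Linear(1, 1)

def compute_loss(batch):
    return model(batch[0]).pow(2).mean()

loader = DataLoader(train_set, batch_size=32, shuffle=True)  # batch size 32
optimizer = Adam(model.parameters(), lr=1e-3)                # Adam, lr 0.001
scheduler = StepLR(optimizer, step_size=50, gamma=0.1)       # lr / 10 every 50 epochs

for epoch in range(180):  # 180 epochs
    for batch in loader:
        optimizer.zero_grad()
        loss = compute_loss(batch)
        loss.backward()
        optimizer.step()
    scheduler.step()
```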