Variational Temporal Abstraction

Authors: Taesup Kim, Sungjin Ahn, Yoshua Bengio

NeurIPS 2019

Reproducibility Variable | Result | LLM Response
--- | --- | ---
Research Type | Experimental | "In experiments, we demonstrate that our proposed method can model 2D and 3D visual sequence datasets with interpretable temporal structure discovery and that its application to jumpy imagination enables more efficient agent-learning in a 3D navigation task."
Researcher Affiliation | Collaboration | Taesup Kim (1,3), Sungjin Ahn (2), Yoshua Bengio (1); 1 Mila, Université de Montréal; 2 Rutgers University; 3 Kakao Brain
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | Yes | "The code of the implementation of our model is available at https://github.com/taesupkim/vta."
Open Datasets | No | The paper uses a self-generated 'bouncing balls' dataset and a self-collected dataset from a '3D maze environment' without providing public access information or citations to pre-existing public datasets: 'We generated a synthetic 2D visual sequence dataset called bouncing balls.' and 'Another sequence dataset is generated from the 3D maze environment by an agent that navigates the maze.' (An illustrative generator sketch follows the table.)
Dataset Splits | No | The paper does not provide specific dataset split information (exact percentages, sample counts, citations to predefined splits, or detailed splitting methodology) needed to reproduce the data partitioning into train/validation/test sets.
Hardware Specification | No | The paper mentions the 'Kakao Brain cloud team for providing computing resources used in this work' but does not provide specific hardware details such as GPU/CPU models, processor types, or memory amounts.
Software Dependencies | No | The paper does not provide specific ancillary software details, such as library or solver names with version numbers.
Experiment Setup | Yes | "During training, the length of observation sequence data X is set to T = 20 and the context length is T_ctx = 5. Hyper-parameters related to sequence decomposition are set as N_max = 5 and l_max = 10. For HRSSM, we used the same training setting as bouncing balls but different N_max = 5 and l_max = 8 for the sequence decomposition. ... controlled by annealing the temperature τ of Gumbel-softmax towards small values from 1.0 to 0.1." (A configuration and annealing sketch follows the table.)
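
Because the bouncing-balls data is self-generated rather than publicly released, a reproducer would have to write their own generator. The following is a minimal, hypothetical Python sketch of a single-ball sequence generator; the frame size, ball radius, velocity range, and rendering are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def generate_bouncing_ball_sequence(T=20, size=32, radius=3, seed=0):
    """Render T binary frames of one ball bouncing inside a size x size box.
    All rendering details here are assumptions for illustration only."""
    rng = np.random.default_rng(seed)
    pos = rng.uniform(radius, size - radius, size=2)   # initial (x, y) position
    vel = rng.uniform(-2.0, 2.0, size=2)               # constant-speed velocity
    ys, xs = np.mgrid[0:size, 0:size]
    frames = np.zeros((T, size, size), dtype=np.float32)
    for t in range(T):
        pos += vel
        for d in range(2):                             # reflect off the walls
            if pos[d] < radius or pos[d] > size - radius:
                vel[d] = -vel[d]
                pos[d] = np.clip(pos[d], radius, size - radius)
        frames[t] = (xs - pos[0]) ** 2 + (ys - pos[1]) ** 2 <= radius ** 2
    return frames  # shape (T, size, size)

sequence = generate_bouncing_ball_sequence()  # one sequence of length T = 20
```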
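
The quoted experiment setup can be collected into a small configuration together with the stated Gumbel-softmax temperature annealing from 1.0 to 0.1. The sketch below uses PyTorch's `torch.nn.functional.gumbel_softmax` and assumes a linear schedule over a chosen number of steps; the paper does not specify the schedule shape or duration, so those choices, and the boundary-logit example, are assumptions rather than the authors' settings.

```python
import torch
import torch.nn.functional as F

# Hyper-parameters quoted in the Experiment Setup row (bouncing-balls setting).
config = {
    "T": 20,          # observation sequence length
    "T_ctx": 5,       # context length
    "N_max": 5,       # max number of sub-sequences in the decomposition
    "l_max": 10,      # max sub-sequence length (8 in the HRSSM setting)
    "tau_start": 1.0, # initial Gumbel-softmax temperature
    "tau_end": 0.1,   # final Gumbel-softmax temperature
}

def annealed_tau(step, total_steps, cfg=config):
    """Linearly anneal the Gumbel-softmax temperature (assumed schedule)."""
    frac = min(step / max(total_steps, 1), 1.0)
    return cfg["tau_start"] + frac * (cfg["tau_end"] - cfg["tau_start"])

# Example: relaxed sampling of per-timestep boundary indicators with the annealed tau.
logits = torch.randn(config["T"], 2)                # hypothetical boundary logits
tau = annealed_tau(step=50_000, total_steps=100_000)
boundaries = F.gumbel_softmax(logits, tau=tau, hard=False)
```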