Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Transfer Entropy Bottleneck: Learning Sequence to Sequence Information Transfer

Authors: Damjan Kalajdzievski, Ximeng Mao, Pascal Fortier-Poisson, Guillaume Lajoie, Blake Aaron Richards

TMLR 2023 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We introduce three synthetic tasks on dual stream modeling problems. We provide experiments on these tasks to show that TEB allows one to improve the predictions of the target stream, and that it is applicable to various data modalities including images and time-series signals.
Researcher Affiliation | Collaboration | 1 Mila - Quebec AI Institute, Montreal, QC, Canada; 2 Department of Neurology and Neurosurgery, McGill University, Montréal, QC, Canada; 3 Department of Computer Science and Operations Research, Université de Montréal, Montréal, QC, Canada; 4 BIOS Health Ltd., Cambridge, UK; 5 Department of Mathematics and Statistics, Université de Montréal, Montréal, QC, Canada; 6 School of Computer Science, McGill University, Montréal, QC, Canada; 7 Montreal Neurological Institute, McGill University, Montréal, QC, Canada; 8 CIFAR, Toronto, ON, Canada
Pseudocode | No | No explicit pseudocode or algorithm blocks are provided. The methodology is described mathematically and textually.
Open Source Code | Yes | Code of our implementations can be found at https://github.com/ximmao/Transfer-Entropy-Bottleneck.
Open Datasets | Yes | To experimentally test Theorems 1 & 2, we create a dataset of videos of rotating MNIST (LeCun et al., 1998) digits. ... The task is next step prediction for a colored bouncing balls (Sutskever et al., 2008) video... The multi-component sinusoids task dataset consists of 30k, 5k, and 5k time-series signals, for training, validation, and testing, respectively.
Dataset Splits | Yes | The rotating MNIST task was generated from a base dataset of videos of 500 examples of rotating MNIST digits (and an additional 500 validation and 500 testing base videos). (Appendix B.2); The needle in a haystack task base dataset consists of 142 videos of two colored bouncing balls, each of 6 frames, in each of 7 distinct color classes (and an additional 75 validation, and 75 testing, base videos in each color class). (Appendix B.3); The multi-component sinusoids task dataset consists of 30k, 5k, and 5k time-series signals, for training, validation, and testing, respectively. (Appendix B.4)
Hardware Specification | No | All the experiments were conducted on a single GPU with 48G memory. (Appendix B.1)
Software Dependencies | No | All of the implementations were done using PyTorch (Paszke et al., 2019). For the multi-component sinusoids task, the decoder d is a neural ODE, implemented using the torchdiffeq package. (Appendix B.1)
Experiment Setup | Yes | We used Adam optimizer (Kingma & Ba, 2015) with the default parameters; an initial learning rate of 10^-4 with no weight decay. All the experiments were conducted on a single GPU with 48G memory. For all models the latent dimension of 128 was used in all non convolutional hidden layers, including the hidden dimension size of the LSTMs. (Appendix B.1) and For the TEB model, we conducted a hyperparameter search for values of β between 1 and 10^12... For D ∈ {5, 15, 20}, we used an initial β = 10^5, decreasing to 10^4 at epoch 300, and further decreasing to 10^3 at epoch 400. For D = 10 we used an initial β = 10^3, decreased to 10^2 at epoch 350, and decreased further to 10 at epoch 400. (Appendix B.3)
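The piecewise β schedule quoted above can be sketched as a small helper. This is a hypothetical illustration only: the function name and the inclusive/exclusive handling of the boundary epochs (300, 350, 400) are assumptions, not taken from the authors' released code.

```python
def beta_schedule(epoch, d):
    """Piecewise-constant beta schedule for the TEB hyperparameter search
    (sketch; boundary handling at the listed epochs is assumed).

    For D in {5, 15, 20}: beta = 1e5, dropping to 1e4 at epoch 300
    and to 1e3 at epoch 400. For D = 10: beta = 1e3, dropping to
    1e2 at epoch 350 and to 10 at epoch 400.
    """
    if d == 10:
        breakpoints = [(350, 1e3), (400, 1e2)]
        final_beta = 1e1
    else:  # D in {5, 15, 20}
        breakpoints = [(300, 1e5), (400, 1e4)]
        final_beta = 1e3
    for boundary, beta in breakpoints:
        if epoch < boundary:
            return beta
    return final_beta
```

Under these assumptions, `beta_schedule(320, d=15)` returns 1e4, and `beta_schedule(400, d=10)` returns 10; a training loop would query the schedule once per epoch and pass the result to the loss.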