Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Transfer Entropy Bottleneck: Learning Sequence to Sequence Information Transfer
Authors: Damjan Kalajdzievski, Ximeng Mao, Pascal Fortier-Poisson, Guillaume Lajoie, Blake Aaron Richards
TMLR 2023 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We introduce three synthetic tasks on dual stream modeling problems. We provide experiments on these tasks to show that TEB allows one to improve the predictions of the target stream, and that it is applicable to various data modalities including images and time-series signals. |
| Researcher Affiliation | Collaboration | 1 Mila Quebec AI Institute, Montreal, QC, Canada 2 Department of Neurology and Neurosurgery, McGill University, Montréal, QC, Canada 3 Department of Computer Science and Operations Research, Université de Montréal, Montréal, QC, Canada 4 BIOS Health Ltd., Cambridge, UK 5 Department of Mathematics and Statistics, Université de Montréal, Montréal, QC, Canada 6 School of Computer Science, McGill University, Montréal, QC, Canada 7 Montreal Neurological Institute, McGill University, Montréal, QC, Canada 8 CIFAR, Toronto, ON, Canada |
| Pseudocode | No | No explicit pseudocode or algorithm blocks are provided. The methodology is described mathematically and textually. |
| Open Source Code | Yes | Code of our implementations can be found at https://github.com/ximmao/TransferEntropyBottleneck. |
| Open Datasets | Yes | To experimentally test Theorems 1 & 2, we create a dataset of videos of rotating MNIST (Lecun et al., 1998) digits. ... The task is next step prediction for a colored bouncing balls (Sutskever et al., 2008) video... The multi-component sinusoids task dataset consists of 30k, 5k, and 5k time-series signals, for training, validation, and testing, respectively. |
| Dataset Splits | Yes | The rotating MNIST task was generated from a base dataset of videos of 500 examples of rotating MNIST digits (and an additional 500 validation and 500 testing base videos). (Appendix B.2); The needle in a haystack task base dataset consists of 142 videos of two colored bouncing balls consisting of 6 frames, in each of 7 distinct color classes (and an additional 75 validation, and 75 testing, base videos in each color class). (Appendix B.3); The multi-component sinusoids task dataset consists of 30k, 5k, and 5k time-series signals, for training, validation, and testing, respectively. (Appendix B.4) |
| Hardware Specification | No | All the experiments were conducted on a single GPU with 48G memory. (Appendix B.1) |
| Software Dependencies | No | All of the implementations were done using PyTorch (Paszke et al., 2019). For the multi-component sinusoids task, the decoder d is a neural ODE, implemented using the torchdiffeq package. (Appendix B.1) |
| Experiment Setup | Yes | We used the Adam optimizer (Kingma & Ba, 2015) with the default parameters; an initial learning rate of 10^-4 with no weight decay. All the experiments were conducted on a single GPU with 48G memory. For all models the latent dimension of 128 was used in all non-convolutional hidden layers, including the hidden dimension size of the LSTMs. (Appendix B.1) and For the TEB model, we conducted a hyperparameter search for values of β between 1 and 10^12... For D ∈ {5, 15, 20}, we used an initial β = 10^5, decreasing to 10^4 at epoch 300, and further decreasing to 10^3 at epoch 400. For D = 10 we used an initial β = 10^3, decreased to 10^2 at epoch 350, and decreased further to 10 at epoch 400. (Appendix B.3) |
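The piecewise-constant β annealing quoted in the Experiment Setup row can be written down directly. The sketch below is an illustration only, not code from the paper's repository: the function name `beta_schedule` is hypothetical, and it assumes the stated switch points take effect at the named epoch (epochs counted from 0).

```python
def beta_schedule(epoch: int, D: int) -> float:
    """Illustrative beta annealing schedule from Appendix B.3 of the paper.

    Assumptions (not stated in the paper): epochs are 0-indexed, and each
    new beta value takes effect exactly at the reported switch epoch.
    """
    if D == 10:
        # D = 10: start at 10^3, drop to 10^2 at epoch 350, then 10 at 400.
        if epoch < 350:
            return 1e3
        elif epoch < 400:
            return 1e2
        return 1e1
    # D in {5, 15, 20}: start at 10^5, drop to 10^4 at epoch 300, then 10^3 at 400.
    if epoch < 300:
        return 1e5
    elif epoch < 400:
        return 1e4
    return 1e3
```

In a training loop, this value would scale the compression term of the TEB objective each epoch, e.g. `loss = recon + beta_schedule(epoch, D) * rate`.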