BRUNO: A Deep Recurrent Model for Exchangeable Data

Authors: Iryna Korshunova, Jonas Degrave, Ferenc Huszár, Yarin Gal, Arthur Gretton, Joni Dambre

NeurIPS 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The advantages of our architecture are demonstrated on learning tasks that require generalisation from short observed sequences while modelling sequence variability, such as conditional image generation, few-shot learning, and anomaly detection.
Researcher Affiliation | Collaboration | Iryna Korshunova, Ghent University, iryna.korshunova@ugent.be; Jonas Degrave, Ghent University, jonas.degrave@ugent.be; Ferenc Huszár, Twitter, fhuszar@twitter.com; Yarin Gal, University of Oxford, yarin@cs.ox.ac.uk; Arthur Gretton, Gatsby Unit, UCL, arthur.gretton@gmail.com; Joni Dambre, Ghent University, joni.dambre@ugent.be
Pseudocode | No | The paper does not contain any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | Our code is available at github.com/IraKorshunova/bruno.
Open Datasets | Yes | The model was trained on Omniglot [10] same-class image sequences of length 20... More generated samples from convolutional and non-convolutional architectures trained on MNIST [11], Fashion-MNIST [22] and CIFAR-10 [9] are given in the appendix.
Dataset Splits | Yes | The model was trained on Omniglot [10] same-class image sequences of length 20 and we used the train-test split and preprocessing as defined by Vinyals et al. [21]. Namely, we resized the images to 28×28 pixels and augmented the dataset with rotations by multiples of 90 degrees, yielding 4,800 and 1,692 classes for training and testing respectively. (A preprocessing sketch follows the table.)
Hardware Specification | Yes | Training was run on a single Titan X GPU for 24 hours.
Software Dependencies | No | The paper mentions software components like "Adam [7] optimizer" (Appendix C) but does not provide specific version numbers for these, nor for any programming languages or deep learning frameworks used.
Experiment Setup | Yes | All models were trained using Adam [7] optimizer with a learning rate of 10^-4 and a batch size of 16. (A training-setup sketch follows the table.)
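
The Omniglot preprocessing quoted under Dataset Splits (resize to 28×28 pixels, augment with rotations by multiples of 90 degrees) can be illustrated with a minimal sketch. The snippet below is an assumption, not the authors' code: the `preprocess_class` helper and the use of NumPy/Pillow are hypothetical choices for illustration only.

```python
# Hedged sketch of the described Omniglot preprocessing: resize each image to
# 28x28 and treat every 90-degree rotation of a class as an additional class,
# so the number of classes grows by a factor of four (consistent with the
# 4,800 train / 1,692 test classes quoted above).
# `preprocess_class` is a hypothetical helper, not taken from the BRUNO repo.
import numpy as np
from PIL import Image

def preprocess_class(image_paths):
    """Load one Omniglot class; return its four rotated variants (0/90/180/270 degrees)."""
    images = np.stack([
        np.asarray(Image.open(p).convert("L").resize((28, 28)), dtype=np.float32) / 255.0
        for p in image_paths
    ])
    # Each rotated copy of the class is kept as a separate class.
    return [np.rot90(images, k, axes=(1, 2)) for k in range(4)]
```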
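
The Experiment Setup row reports only the optimizer, learning rate, and batch size. A minimal training-loop skeleton using those values might look as follows; the choice of PyTorch and the `model.log_likelihood` interface are assumptions made for illustration, since the paper does not state which framework or versions were used.

```python
# Illustrative training loop with the reported hyperparameters:
# Adam optimizer, learning rate 1e-4, batch size 16.
# The framework (PyTorch) and the model's interface are assumptions.
import torch
from torch.utils.data import DataLoader

def train(model, train_dataset, n_epochs=1):
    loader = DataLoader(train_dataset, batch_size=16, shuffle=True)
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
    for _ in range(n_epochs):
        for batch in loader:
            optimizer.zero_grad()
            loss = -model.log_likelihood(batch)  # hypothetical sequence NLL objective
            loss.backward()
            optimizer.step()
```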