Learning Robust Representations via Multi-View Information Bottleneck

Authors: Marco Federici, Anjan Dutta, Patrick Forré, Nate Kushman, Zeynep Akata

ICLR 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "In this section we demonstrate the effectiveness of our model against state-of-the-art baselines in both the multi-view and single-view setting. In the single-view setting, we also estimate the coordinates on the Information Plane for each of the baseline methods as well as our method to validate the theory in Section 3."
Researcher Affiliation | Collaboration | Marco Federici, University of Amsterdam (m.federici@uva.nl); Anjan Dutta, University of Exeter (a.dutta@exeter.ac.uk); Patrick Forré, University of Amsterdam (p.d.forre@uva.nl); Nate Kushman, Microsoft Research (nkushman@microsoft.com); Zeynep Akata, University of Tuebingen (zeynep.akata@uni-tuebingen.de)
Pseudocode | Yes | Algorithm 1: L_MIB(θ, ψ; β, B) (a hedged sketch of this loss follows the table)
Open Source Code | Yes | Code available at https://github.com/mfederici/Multi-View-Information-Bottleneck
Open Datasets | Yes | "Dataset. The Sketchy dataset (Sangkloy et al., 2016) consists of 12,500 images and 75,471 hand-drawn sketches of objects from 125 classes. As in Liu et al. (2017), we also include another 60,502 images from ImageNet (Deng et al., 2009) from the same classes..." and "Dataset. The MIR-Flickr dataset (Huiskes & Lew, 2008) consists of 1M images..." and "Dataset. The dataset is generated from MNIST."
Dataset Splits | Yes | "The labeled set contains 5 different splits of train, validation and test sets of size 10K/5K/10K respectively."
Hardware Specification | Yes | "The Titan Xp and Titan V used for this research were donated by the NVIDIA Corporation."
Software Dependencies | No | "All the experiments have been performed using the Adam optimizer with a learning rate of 10^-4 for both encoders and the estimation network."
Experiment Setup | Yes | "To facilitate the optimization, the hyper-parameter β is slowly increased during training, starting from a small value 10^-4 to its final value with an exponential schedule." and "Each training iteration used batches of size B = 128." and "All the experiments have been performed using the Adam optimizer with a learning rate of 10^-4 for both encoders and the estimation network." (a β-schedule sketch follows the table)
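
The Algorithm 1 row above refers to the L_MIB objective, which pairs a mutual-information term between the two view representations with a symmetrized KL penalty weighted by β. The sketch below is an illustrative reconstruction only, not the authors' code: the module names enc1, enc2, and critic are hypothetical, the posteriors are assumed Gaussian, and the MI term uses a Jensen-Shannon-style bound computed by the estimation network; the linked repository contains the actual implementation.

```python
import torch
import torch.nn.functional as F
from torch.distributions import Normal, kl_divergence

def mib_loss(enc1, enc2, critic, v1, v2, beta):
    """One step of an MIB-style objective (hedged sketch, not the authors' code).

    enc1, enc2 : hypothetical encoders returning (mu, log_sigma) for p(z|v1), p(z|v2)
    critic     : hypothetical estimation network scoring (z1, z2) pairs
    beta       : trade-off between discarding view-specific information and
                 preserving the information shared by the two views
    """
    mu1, logs1 = enc1(v1)
    mu2, logs2 = enc2(v2)
    p1, p2 = Normal(mu1, logs1.exp()), Normal(mu2, logs2.exp())

    # Sample representations with the reparameterization trick.
    z1, z2 = p1.rsample(), p2.rsample()

    # Symmetrized KL between the two posteriors (redundancy penalty).
    d_skl = 0.5 * (kl_divergence(p1, p2) + kl_divergence(p2, p1)).sum(-1).mean()

    # Jensen-Shannon mutual-information lower bound: joint pairs vs. shuffled pairs.
    joint = critic(z1, z2)
    marginal = critic(z1, z2[torch.randperm(z2.size(0), device=z2.device)])
    mi_estimate = (-F.softplus(-joint)).mean() - F.softplus(marginal).mean()

    # Maximize shared information, penalize superfluous (view-specific) information.
    return -mi_estimate + beta * d_skl
```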
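
The Experiment Setup row quotes an exponential annealing schedule that grows β from 10^-4 to its final value. A minimal sketch of such a schedule, assuming log-linear interpolation and a placeholder final value of 1.0 (the paper's final β and schedule length are not quoted here):

```python
import math

def beta_schedule(step, total_steps, beta_start=1e-4, beta_end=1.0):
    """Exponentially anneal beta from beta_start to beta_end over total_steps."""
    # Fraction of training completed, clipped to [0, 1].
    t = min(step / total_steps, 1.0)
    # Linear interpolation in log-space yields exponential growth of beta.
    return math.exp((1.0 - t) * math.log(beta_start) + t * math.log(beta_end))
```

For example, beta_schedule(0, 10000) returns 1e-4 and beta_schedule(10000, 10000) returns 1.0.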