Mutual Information Gradient Estimation for Representation Learning

Authors: Liangjian Wen, Yiji Zhou, Lirong He, Mingyuan Zhou, Zenglin Xu

ICLR 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results indicate significant performance improvements in learning useful representations.
Researcher Affiliation | Academia | 1 SMILE Lab, School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, China; 2 Center for Artificial Intelligence, Peng Cheng Laboratory, Shenzhen, China; 3 McCombs School of Business, University of Texas at Austin, Austin, United States; 4 School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen, China
Pseudocode | Yes | Algorithm 1: MIGE (Circumstance I). (A hedged sketch of the score estimator behind it appears after this table.)
Open Source Code | No | The provided links (https://github.com/rdevon/DIM and https://github.com/alexalemi/vib_demo) are for the baseline models (DIM and DVB) used for comparison, not for the proposed MIGE method.
Open Datasets | Yes | We test DIM on image datasets CIFAR-10, CIFAR-100 and STL-10 to evaluate our MIGE. ... We demonstrate an implementation of the IB objective on permutation-invariant MNIST using MIGE. (A dataset-loading sketch follows this table.)
Dataset Splits | Yes | For consistent comparison, we follow the experiments of Deep InfoMax (DIM) to set the experimental setup as in Hjelm et al. (2019). ... We adopt the same architecture and empirical settings used in Alemi et al. (2017)...
Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments are provided.
Software Dependencies | No | PyTorch is mentioned as the implementation framework, but no version number is provided for PyTorch or any other software dependency.
Experiment Setup | Yes | For consistent comparison, we adopt the same architecture and empirical settings used in Alemi et al. (2017), except that an initial learning rate of 2e-4 is set for the Adam optimizer and the learning rate is decayed exponentially by a factor of 0.96 every 2 epochs. The threshold of the score function's Stein gradient estimator is set to 0.94. (A configuration sketch follows this table.)
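
The paper's Algorithm 1 (MIGE) rests on estimating the score function ∇_x log q(x) from samples with a spectral Stein gradient estimator. Below is a minimal PyTorch sketch of that estimator with an RBF kernel; the function names (`rbf_kernel`, `ssge_score`) are illustrative, and reading the quoted 0.94 threshold as a cumulative eigenvalue-ratio cutoff is our assumption, not the authors' released code.

```python
import torch


def rbf_kernel(x, y, sigma):
    """k(x, y) = exp(-||x - y||^2 / (2 * sigma^2)) between two sample sets."""
    dist2 = torch.cdist(x, y).pow(2)
    return torch.exp(-dist2 / (2 * sigma ** 2))


def ssge_score(samples, queries, sigma=1.0, threshold=0.94):
    """Estimate grad_x log q(x) at `queries` from i.i.d. `samples` of q.

    samples: (M, D), queries: (N, D); returns an (N, D) score estimate.
    """
    m = samples.shape[0]
    k_ss = rbf_kernel(samples, samples, sigma)           # (M, M) Gram matrix
    eigvals, eigvecs = torch.linalg.eigh(k_ss)           # ascending eigenvalues
    eigvals, eigvecs = eigvals.flip(0), eigvecs.flip(1)  # sort descending

    # Keep the leading J eigenpairs whose cumulative eigenvalue mass reaches
    # the threshold (our reading of the paper's 0.94 setting).
    ratio = torch.cumsum(eigvals, dim=0) / eigvals.sum()
    j = int((ratio < threshold).sum().item()) + 1
    eigvals, eigvecs = eigvals[:j], eigvecs[:, :j]

    # Nystrom approximation of the kernel eigenfunctions at the queries:
    # psi_j(x) = sqrt(M) / lambda_j * sum_m u_{jm} k(x, x_m)
    k_qs = rbf_kernel(queries, samples, sigma)           # (N, M)
    psi = (m ** 0.5) * (k_qs @ eigvecs) / eigvals        # (N, J)

    # beta_j = -(1/M) sum_m grad_{x_m} psi_j(x_m), using the RBF gradient
    # grad_x k(x, y) = -(x - y) / sigma^2 * k(x, y).
    diff = samples.unsqueeze(1) - samples.unsqueeze(0)   # (M, M, D)
    grad_k = -diff * k_ss.unsqueeze(-1) / sigma ** 2     # (M, M, D)
    grad_psi = (m ** 0.5) * torch.einsum('mnd,nj->mjd', grad_k, eigvecs)
    grad_psi = grad_psi / eigvals.view(1, -1, 1)         # (M, J, D)
    beta = -grad_psi.mean(dim=0)                         # (J, D)

    return psi @ beta                                    # (N, D)
```

MIGE then plugs such score estimates into the chain rule to obtain gradient estimates of mutual information with respect to the encoder parameters; see Algorithm 1 in the paper for the full procedure.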
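All four datasets quoted in the Open Datasets row are standard public downloads. A minimal loading sketch via torchvision; the root path and the bare `ToTensor` transform are our own illustrative choices, not the paper's preprocessing:

```python
from torchvision import datasets, transforms

to_tensor = transforms.ToTensor()

# The four datasets cited in the paper's experiments.
cifar10 = datasets.CIFAR10('./data', train=True, download=True, transform=to_tensor)
cifar100 = datasets.CIFAR100('./data', train=True, download=True, transform=to_tensor)
stl10 = datasets.STL10('./data', split='train', download=True, transform=to_tensor)
mnist = datasets.MNIST('./data', train=True, download=True, transform=to_tensor)
```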
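The quoted experiment setup maps directly onto standard PyTorch components. A minimal sketch, assuming `model` is a placeholder network and that `StepLR` is an acceptable realization of "decay by a factor of 0.96 every 2 epochs" (the paper does not name a specific scheduler):

```python
import torch

model = torch.nn.Linear(784, 256)  # placeholder network for illustration

# Adam with the quoted initial learning rate of 2e-4.
optimizer = torch.optim.Adam(model.parameters(), lr=2e-4)

# StepLR multiplies the learning rate by gamma once every step_size epochs,
# matching the quoted schedule of 0.96 every 2 epochs.
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=2, gamma=0.96)

for epoch in range(200):
    # ... one training epoch over the data loader goes here ...
    scheduler.step()  # apply the 0.96 decay at epoch boundaries
```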