Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Mutual Information Gradient Estimation for Representation Learning

Authors: Liangjian Wen, Yiji Zhou, Lirong He, Mingyuan Zhou, Zenglin Xu

ICLR 2020 | Venue PDF | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Experimental results have indicated significant performance improvement in learning useful representations. |
| Researcher Affiliation | Academia | (1) SMILE Lab, School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, China; (2) Center for Artificial Intelligence, Peng Cheng Laboratory, Shenzhen, China; (3) McCombs School of Business, University of Texas at Austin, Austin, United States; (4) School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen, China |
| Pseudocode | Yes | Algorithm 1: MIGE (Circumstance I) |
| Open Source Code | No | The provided links (https://github.com/rdevon/DIM and https://github.com/alexalemi/vib_demo) are for the baseline models (DIM and DVIB) used for comparison, not for the proposed MIGE method. |
| Open Datasets | Yes | We test DIM on image datasets CIFAR-10, CIFAR-100 and STL-10 to evaluate our MIGE. ... We demonstrate an implementation of the IB objective on permutation-invariant MNIST using MIGE. |
| Dataset Splits | Yes | For consistent comparison, we follow the experiments of Deep InfoMax (DIM) to set the experimental setup as in Hjelm et al. (2019). ... We adopt the same architecture and empirical settings used in Alemi et al. (2017)... |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments are provided. |
| Software Dependencies | No | PyTorch is mentioned as the implementation framework, but no version number is provided for PyTorch or any other software dependency. |
| Experiment Setup | Yes | For consistent comparison, we adopt the same architecture and empirical settings used in Alemi et al. (2017), except that an initial learning rate of 2e-4 is set for the Adam optimizer, and an exponential decay by a factor of 0.96 is applied every 2 epochs. The threshold of the score function's Stein gradient estimator is set to 0.94. |
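
The optimizer settings quoted in the Experiment Setup row are concrete enough to sketch. Below is a minimal PyTorch sketch (PyTorch is the framework the report names) of Adam with an initial learning rate of 2e-4 and exponential decay by a factor of 0.96 every 2 epochs. The encoder, epoch count, and loop skeleton are hypothetical placeholders for illustration, not the paper's architecture or training code.

```python
# Minimal sketch of the quoted training configuration, assuming PyTorch.
# The encoder and epoch count are hypothetical placeholders; the paper's
# architectures follow Alemi et al. (2017) and Hjelm et al. (2019).
import torch
import torch.nn as nn

encoder = nn.Sequential(  # placeholder network, not the paper's architecture
    nn.Linear(784, 1024), nn.ReLU(), nn.Linear(1024, 256)
)
optimizer = torch.optim.Adam(encoder.parameters(), lr=2e-4)
# ExponentialLR multiplies the learning rate by gamma at each scheduler.step();
# stepping once every 2 epochs matches "decay by a factor of 0.96 every 2 epochs".
scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.96)

for epoch in range(20):  # hypothetical epoch count
    # ... one pass over the data: compute the MIGE objective, then
    # optimizer.zero_grad(); loss.backward(); optimizer.step() per batch ...
    if (epoch + 1) % 2 == 0:
        scheduler.step()
```

Note that the 0.94 threshold quoted above governs the Stein gradient estimator inside MIGE's mutual-information gradient estimate, which this training-loop sketch does not cover.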