Tight Mutual Information Estimation With Contrastive Fenchel-Legendre Optimization

Authors: Qing Guo, Junya Chen, Dong Wang, Yuewei Yang, Xinwei Deng, Jing Huang, Larry Carin, Fan Li, Chenyang Tao

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We consider an extensive range of tasks to validate FLO and benchmark it against state-of-the-art solutions. Our code is available from https://github.com/qingguo666/FLO. All experiments are implemented with PyTorch. Comparison to baseline MI bounds. We start by comparing FLO to the following popular competing variational estimators: NWJ, TUBA, and InfoNCE. We use the bilinear critic implementation for all models, which maximally encourages both sample efficiency and code simplicity; this strategy also performs best in our observations. We consider the synthetic benchmark from [62], where (X ∈ R^d, Y ∈ R^d) is jointly standard Gaussian with diagonal cross-correlation parameterized by ρ ∈ [0, 1). We report d = 10 and ρ ∈ [0, 0.99] here (other studies only report ρ up to 0.9, which is less challenging), providing reasonable coverage of the range of MI one may encounter in empirical settings.
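The Gaussian benchmark quoted above has a closed-form ground truth: with per-dimension correlation ρ, I(X; Y) = -(d/2) log(1 - ρ²) nats. A minimal NumPy sketch of the sampler and the true MI (function names are illustrative, not from the FLO codebase):

```python
import numpy as np

def sample_correlated_gaussian(rho, d, n, seed=0):
    """Draw n pairs (x, y) where each coordinate pair (x_i, y_i) is
    bivariate standard Gaussian with correlation rho (diagonal
    cross-correlation, as in the benchmark from [62])."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((n, d))
    eps = rng.standard_normal((n, d))
    y = rho * x + np.sqrt(1.0 - rho**2) * eps
    return x, y

def true_mi(rho, d):
    """Closed-form MI for this benchmark: I(X; Y) = -(d/2) log(1 - rho^2),
    in nats."""
    return -0.5 * d * np.log(1.0 - rho**2)
```

At d = 10 and ρ = 0.99 this gives roughly 19.6 nats, which is why extending the sweep past ρ = 0.9 makes the benchmark substantially harder for the estimators.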
Researcher Affiliation | Collaboration | Qing Guo1, Junya Chen2, Dong Wang2, Yuewei Yang2, Xinwei Deng1, Lawrence Carin2,3, Fan Li2, Jing Huang4, Chenyang Tao2,4 — 1Virginia Tech, 2Duke University, 3KAUST, 4Amazon
Pseudocode | Yes | Algorithm 1: FLO
Open Source Code | Yes | Our code is available from https://github.com/qingguo666/FLO.
Open Datasets | Yes | Self-supervised learning (SSL)... In this experiment, we follow the SSL setup described in the SimCLR paper [15].
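The SimCLR setup referenced here optimizes an InfoNCE-style contrastive objective, the same family as the InfoNCE baseline compared against above. A minimal NumPy sketch of the InfoNCE MI estimate computed from an n × n critic score matrix, where the diagonal holds the positive (jointly drawn) pairs (the function name and plain-matrix interface are illustrative assumptions):

```python
import numpy as np

def infonce_bound(scores):
    """InfoNCE lower bound on MI from an n x n score matrix with
    scores[i, j] = g(x_i, y_j):
        I_NCE = mean_i [ scores[i, i] - logsumexp_j scores[i, j] ] + log n
    """
    n = scores.shape[0]
    # Numerically stable row-wise log-sum-exp.
    m = scores.max(axis=1, keepdims=True)
    lse = (m + np.log(np.exp(scores - m).sum(axis=1, keepdims=True))).ravel()
    return float(np.mean(np.diag(scores) - lse) + np.log(n))
```

Note the estimate can never exceed log n for batch size n, which is the well-known reason InfoNCE-style bounds underestimate large MI values such as those at ρ close to 1 in the synthetic benchmark.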
Dataset Splits | No | The paper mentions 'training epochs' and 'Top1 Test Accuracy' for datasets such as CIFAR-10, implying the use of train and test sets, but it does not explicitly provide split percentages, sample counts, or references to specific predefined validation splits in the main text.
Hardware Specification | No | The acknowledgements mention computational resources from 'Virginia Tech', the 'CCI AI testbed', and 'XSEDE' (including 'PSC Bridges-2 and SDSC Expanse'), but no concrete hardware details such as exact GPU or CPU models are specified.
Software Dependencies | No | The paper states 'All experiments are implemented with PyTorch.' but does not provide a version number for PyTorch or any other software dependency.
Experiment Setup | No | The paper states: 'Limited by space, we present only the key results in the main text, and defer ablation studies and details of our experimental setups to the Appendix.' Specific hyperparameters and detailed training configurations are therefore not provided in the main body of the paper.