Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Connecting Jensen–Shannon and Kullback–Leibler Divergences: A New Bound for Representation Learning

Authors: Reuben Dorent, Polina Golland, William (Sandy) Wells

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Extensive experiments demonstrate that our lower bound is tight when applied to MI estimation. We compared our lower bound to state-of-the-art neural estimators of variational lower bound across a range of established reference scenarios. Our lower bound estimator consistently provides a stable, low-variance estimate of a tight lower bound on MI. We also demonstrate its practical usefulness in the context of the Information Bottleneck framework.
Researcher Affiliation	Academia	Reuben Dorent Inria EMAIL Polina Golland MIT EMAIL William Wells III Harvard, MIT EMAIL
Pseudocode	Yes	Algorithm 1: Algorithmic implementation of the Ξ function as the inverse of its known inverse Ξ^-1
Open Source Code	Yes	Implementation details [2] are available in Appendix E.2. [2] https://github.com/ReubenDo/JSDlowerbound
Open Datasets	Yes	Table 1: Generalization performance (%) on MNIST dataset. Performance is evaluated by the mean classification accuracy on the MNIST test set after training on the MNIST training set.
Dataset Splits	Yes	Table 1: Generalization performance (%) on MNIST dataset. Performance is evaluated by the mean classification accuracy on the MNIST test set after training on the MNIST training set.
Hardware Specification	Yes	The computational time analysis is developed on a server with CPU Intel Xeon Platinum 8468 48-Core Processor and an NVIDIA GPU H100 and reported in Table 7.
Software Dependencies	No	The paper does not explicitly mention specific software dependencies with version numbers like Python, PyTorch, or CUDA versions. It mentions using 'Adam optimizer' but not with a specific version or framework.
Experiment Setup	Yes	Training is performed for 4000 steps using the Adam optimizer and batch size N = 64, matching the architecture and hyperparameters from prior work for comparability. As the function Ξ is strictly increasing, maximizing Ξ(log 2 − L_CE) is equivalent to minimizing the cross-entropy loss L_CE. Therefore, the approximation of Ξ is not used during optimization. Implementation details are available in Appendix E.2.
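
Two entries in the table rest on the same property: the Pseudocode entry describes computing Ξ as the inverse of its known inverse, and the Experiment Setup entry notes that, because Ξ is strictly increasing, it can be skipped during optimization. The sketch below illustrates both ideas with plain bisection. It is not the paper's Algorithm 1: `stand_in_inverse` is a hypothetical strictly increasing function chosen only for illustration, and the bracket `[-10, 10]`, the tolerance, and the loss values are assumptions.

```python
import math


def invert_increasing(f, y, lo, hi, tol=1e-10, max_iter=200):
    """Solve f(x) = y on [lo, hi] by bisection, assuming f is strictly increasing."""
    if not (f(lo) <= y <= f(hi)):
        raise ValueError("y is outside the range of f on [lo, hi]")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if f(mid) < y:
            lo = mid  # the solution lies to the right of mid
        else:
            hi = mid  # the solution lies at or to the left of mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


def stand_in_inverse(x):
    """Hypothetical stand-in for a known inverse; NOT the paper's inverse of Xi."""
    return x + x ** 3


def stand_in_forward(y):
    """Forward map obtained by numerically inverting the stand-in inverse."""
    return invert_increasing(stand_in_inverse, y, -10.0, 10.0)


# Illustrative cross-entropy values (assumed, not taken from the paper).
losses = [0.6, 0.2, 0.45]

# Applying a strictly increasing map to (log 2 - loss) preserves the ranking,
# so the smallest loss always yields the largest transformed score.
scores = [stand_in_forward(math.log(2) - loss) for loss in losses]
best_by_loss = min(range(len(losses)), key=lambda i: losses[i])
best_by_score = max(range(len(scores)), key=lambda i: scores[i])
```

Here `best_by_loss` and `best_by_score` select the same index. This is why a training loop can minimize the cross-entropy directly and apply the increasing map only afterwards, when reporting the bound.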