Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Gaussian Lower Bound for the Information Bottleneck Limit
Authors: Amichai Painsky, Naftali Tishby
JMLR 2017 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We now demonstrate our suggested methodology with a simple illustrative example. Let X ∼ N(0, 1), W ∼ N(0, ε²) and Z ∼ N(µz, 1) be three normally distributed random variables, all independent of each other. ... We now examine our suggested multivariate approach in different setups. As in the univariate case, we draw samples from a given model and bound the mutual information I(X, Y) from below according to (3). First, we apply the multivariate ACE procedure (Section 5.1) to achieve an upper bound to our objective. Then, we apply separate Gaussianization to ACE's outcome, to attain an immediate lower bound (Section 5.3.1). Further, we tighten this lower bound by replacing the separate Gaussianization with bi-terminal Gaussianization of ACE's outcome (Section 5.3.2). ... The plot on the left of Figure 4 demonstrates the results we achieve. The black curve on top is the approximated IB curve, using the reverse annealing procedure. The red curve on the bottom is a benchmark lower bound, achieved by simply applying the GIB to X and Y, as if they were jointly Gaussian. |
| Researcher Affiliation | Academia | Amichai Painsky EMAIL Naftali Tishby EMAIL School of Computer Science and Engineering and The Interdisciplinary Center for Neural Computation The Hebrew University of Jerusalem Givat Ram, Jerusalem 91904, Israel |
| Pseudocode | Yes | Algorithm 1: Alternating Gaussianized Conditional Expectations (AGCE) for the univariate case; Algorithm 2: Bi-terminal multivariate Gaussianization |
| Open Source Code | Yes | A MATLAB implementation of our suggested approach is publicly available at the first author's web-page: https://sites.google.com/site/amichaipainsky/software |
| Open Datasets | No | The paper generates synthetic data based on described models (e.g., Gaussian mixture, exponential model) rather than using pre-existing public datasets. For instance, it states: "Let X ∼ N(0, 1), W ∼ N(0, ε²) and Z ∼ N(µz, 1) be three normally distributed random variables... Define Y as..." and "In this model, each component of X and W is exponentially distributed with a unit parameter... we define Y = X + W...". No specific link, DOI, or citation to an external open dataset is provided. |
| Dataset Splits | No | The paper discusses generating "10,000 independent draws of X and Y" for illustrative examples and uses terms like "discretization (via Gaussian quadratures)" but does not define explicit training, test, or validation splits for data. It's focused on evaluating bounds rather than model performance that would require such splits. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., GPU models, CPU types, memory) used to run the experiments or simulations. |
| Software Dependencies | No | The paper mentions that "A MATLAB implementation of our suggested approach is publicly available at the first author's web-page", indicating that MATLAB was used. However, it does not specify a version number for MATLAB or for any other software libraries or dependencies. |
| Experiment Setup | Yes | For example, we set µz = 10 and ε = 0.1. ... We use a Gaussian kernel with varying parameters to achieve the reported results. ... Since the reverse annealing was originally designed for discrete random variables, we apply discretization (via Gaussian quadratures) to our probability distributions in all of our experiments. |
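To make the quoted setup concrete, the following is a minimal sketch (not the authors' MATLAB code) of the paper's exponential model and its benchmark lower bound: draw 10,000 samples with X and W exponential with unit parameter, set Y = X + W, and evaluate the Gaussian mutual information at the empirical correlation, i.e., treat X and Y "as if they were jointly Gaussian". The seed and variable names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000  # the paper reports 10,000 independent draws

# Exponential model from the paper (univariate sketch):
# X and W are exponentially distributed with unit parameter, Y = X + W.
X = rng.exponential(scale=1.0, size=n)
W = rng.exponential(scale=1.0, size=n)
Y = X + W

# Benchmark described in the paper: apply the Gaussian IB view to X and Y
# as if they were jointly Gaussian. The Gaussian mutual information at
# empirical correlation rho is -1/2 * log(1 - rho^2), in nats.
rho = np.corrcoef(X, Y)[0, 1]
mi_gaussian_benchmark = -0.5 * np.log(1.0 - rho**2)
print(f"empirical rho = {rho:.3f}, "
      f"Gaussian benchmark bound = {mi_gaussian_benchmark:.3f} nats")
```

For this model the population correlation is 1/√2 ≈ 0.707, so the benchmark evaluates to roughly ½ log 2 ≈ 0.35 nats; the paper's Gaussianization-based bounds are designed to tighten this kind of naive Gaussian benchmark.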