Learning Optimal Representations with the Decodable Information Bottleneck
Authors: Yann Dubois, Douwe Kiela, David J. Schwab, Ramakrishna Vedantam
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our framework in practical settings, focusing on: (i) the relation between V-sufficiency and Alice's best achievable performance; (ii) the relation between V-minimality and generalization; (iii) the consequence of a mismatch between V_Alice and the functional family V_Bob w.r.t. which Z is sufficient or minimal, especially in IB's setting V_Bob = U; (iv) the use of our framework to predict generalization of trained networks. Many of our experiments involve sweeping over the complexity of families V⁻ ⊆ V ⊆ V⁺; we do this by varying the widths of MLPs, with V → U in the infinite-width limit [40, 41]. (An illustrative width-sweep sketch follows the table.) |
| Researcher Affiliation | Collaboration | Yann Dubois (Facebook AI Research, yannd@fb.com); Douwe Kiela (Facebook AI Research, dkiela@fb.com); David J. Schwab (Facebook AI Research and CUNY Graduate Center, dschwab@fb.com); Ramakrishna Vedantam (Facebook AI Research, ramav@fb.com) |
| Pseudocode | Yes | Figure 2: Practical DIB. (a) Pseudo-code for the empirical loss L̂_DIB(D). (A hedged sketch of a DIB-style loss follows the table.) |
| Open Source Code | No | The paper does not contain an explicit statement about releasing source code or a link to a code repository for the described methodology. |
| Open Datasets | Yes | Right two plots: 2D representations encoded by a multi-layer perceptron (MLP) for odd-even classification of 200 MNIST [22] examples. |
| Dataset Splits | No | The paper discusses 'train and test performance' and 'train-test gap' but does not explicitly mention validation data splits or their proportions. |
| Hardware Specification | No | The paper describes model architectures such as a 'ResNet18 encoder' and a '3-MLP encoder', but does not provide specific hardware details such as GPU models, CPU types, or memory specifications used for the experiments. |
| Software Dependencies | No | The paper cites PyTorch as a reference but does not specify version numbers for PyTorch or any other software dependencies used in the experiments. |
| Experiment Setup | Yes | We use a 3-MLP encoder with around 21M parameters and a 1024-dimensional Z. Since we want to investigate the generalization of ERMs resulting from Bob's criterion, we do not use (possibly implicit) regularizers such as a large learning rate [44]. For more experimental details see Appx. D.1. |
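
The Pseudocode row points to the paper's Figure 2, which gives pseudo-code for the empirical loss L̂_DIB(D); that pseudo-code is not reproduced in this report. For orientation only, below is a hedged sketch of a DIB-style minimax objective: a sufficiency term (cross-entropy on Y) combined with an adversarial minimality term over auxiliary random labelings, using a small MLP encoder in the spirit of the 3-MLP encoder with a 1024-dimensional Z from the Experiment Setup row. All names (`MLP`, `dib_style_loss`, `minimality_heads`, `rand_labels`, `beta`) and the exact form of the minimality term are our own illustrative assumptions, not the paper's.

```python
# Hedged sketch of a DIB-style objective (NOT the paper's Figure 2 pseudo-code).
# Assumption: sufficiency = cross-entropy on Y; minimality = adversarial game in
# which auxiliary heads try to decode random labelings N_k from Z while the
# encoder is trained to make them fail.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MLP(nn.Module):
    """Simple MLP used here for the encoder, classifier head, and minimality heads."""

    def __init__(self, d_in, d_hidden, d_out, n_layers=3):
        super().__init__()
        layers, d = [], d_in
        for _ in range(n_layers - 1):
            layers += [nn.Linear(d, d_hidden), nn.ReLU()]
            d = d_hidden
        layers.append(nn.Linear(d, d_out))
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return self.net(x)


def dib_style_loss(encoder, clf_head, minimality_heads, x, y, rand_labels, beta=1.0):
    """One illustrative loss evaluation.

    rand_labels: LongTensor [n_labelings, batch] of random auxiliary labelings.
    Returns (encoder_loss, heads_loss): the minimality heads are optimized on
    heads_loss (with Z detached), the encoder and clf_head on encoder_loss.
    """
    z = encoder(x)
    sufficiency = F.cross_entropy(clf_head(z), y)  # keep Y decodable from Z

    # Encoder tries to remove decodable information about the auxiliary labelings.
    minimality = torch.stack(
        [F.cross_entropy(head(z), n) for head, n in zip(minimality_heads, rand_labels)]
    ).mean()
    encoder_loss = sufficiency - beta * minimality

    # Heads are trained to decode the labelings from a frozen copy of Z.
    z_detached = z.detach()
    heads_loss = torch.stack(
        [F.cross_entropy(head(z_detached), n) for head, n in zip(minimality_heads, rand_labels)]
    ).mean()
    return encoder_loss, heads_loss
```

In a training loop one would alternate (or interleave) optimizer steps on `encoder_loss` for the encoder and classifier head and on `heads_loss` for the minimality heads; the alternation schedule and `beta` are hyperparameters we have not taken from the paper.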
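The Research Type row describes sweeping over the complexity of the families V⁻ ⊆ V ⊆ V⁺ by varying MLP widths. The sketch below shows one plausible way to run such a sweep on frozen representations Z and record train/test accuracy (and hence the train-test gap); the widths, probe depth, optimizer, learning rate, and step count are illustrative assumptions, not the paper's settings.

```python
# Hedged sketch of a width sweep over MLP probe families on frozen representations.
import torch
import torch.nn as nn


def make_probe(d_z, width, n_classes, n_layers=2):
    """An MLP probe f in V; a larger `width` corresponds to a richer family V."""
    layers, d = [], d_z
    for _ in range(n_layers - 1):
        layers += [nn.Linear(d, width), nn.ReLU()]
        d = width
    layers.append(nn.Linear(d, n_classes))
    return nn.Sequential(*layers)


def sweep_widths(z_train, y_train, z_test, y_test, widths=(8, 32, 128, 512, 2048)):
    """Train one probe per width on frozen Z and report (train_acc, test_acc)."""
    results = {}
    for w in widths:
        probe = make_probe(z_train.shape[1], w, int(y_train.max()) + 1)
        opt = torch.optim.Adam(probe.parameters(), lr=1e-3)
        for _ in range(200):  # hypothetical number of full-batch steps
            opt.zero_grad()
            loss = nn.functional.cross_entropy(probe(z_train), y_train)
            loss.backward()
            opt.step()
        with torch.no_grad():
            train_acc = (probe(z_train).argmax(1) == y_train).float().mean().item()
            test_acc = (probe(z_test).argmax(1) == y_test).float().mean().item()
        results[w] = (train_acc, test_acc)
    return results
```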