Conditioning and Processing: Techniques to Improve Information-Theoretic Generalization Bounds
Authors: Hassan Hafez-Kolahi, Zeinab Golgooni, Shohreh Kasaei, Mahdieh Soleymani Baghshah
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | Obtaining generalization bounds for learning algorithms is one of the main subjects studied in theoretical machine learning. In recent years, information-theoretic bounds on generalization have gained the attention of researchers. This approach provides an insight into learning algorithms by considering the mutual information between the model and the training set. In this paper, a probabilistic graphical representation of this approach is adopted and two general techniques to improve the bounds are introduced, namely conditioning and processing. These techniques can be used to improve the bounds by either sharpening them or increasing their applicability. It is demonstrated that the proposed framework provides a simple and unified way to explain a variety of recent tightening results. New improved bounds derived by utilizing these techniques are also proposed. |
| Researcher Affiliation | Academia | Hassan Hafez-Kolahi Department of Computer Engineering Sharif University of Technology hafez@ce.sharif.edu; Zeinab Golgooni Department of Computer Engineering Sharif University of Technology golgooni@ce.sharif.edu; Shohreh Kasaei Department of Computer Engineering Sharif University of Technology kasaei@sharif.edu; Mahdieh Soleymani Baghshah Department of Computer Engineering Sharif University of Technology soleymani@sharif.edu |
| Pseudocode | No | The paper contains mathematical formulas, theorems, lemmas, and figures, but no pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any concrete access information (e.g., a specific repository link or an explicit code release statement) for the methodology described. |
| Open Datasets | No | The paper is theoretical and discusses concepts like a "training set" in a general context, but does not describe experiments using a specific dataset or provide concrete access information for any dataset. |
| Dataset Splits | No | The paper is theoretical and does not describe specific dataset splits for training, validation, or testing. |
| Hardware Specification | No | The paper is theoretical and does not describe any experiments that would require specific hardware, thus no hardware specifications are provided. |
| Software Dependencies | No | The paper is theoretical and does not describe any experiments that would require specific software dependencies, thus none are provided. |
| Experiment Setup | No | The paper is theoretical and does not describe any experimental setup details, hyperparameters, or training configurations. |
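The abstract describes bounding generalization via the mutual information I(W;S) between the learned model W and the training set S. A minimal sketch of this style of bound (the form |E[gen]| ≤ √(2σ²·I(W;S)/n) for a σ-subgaussian loss) is below; the majority-vote "learner", the noise level, and the tiny dataset size are illustrative assumptions, not the paper's construction:

```python
import itertools
import math

def mutual_information(joint):
    """Exact I(W;S) in nats from a joint distribution {(w, s): prob}."""
    pw, ps = {}, {}
    for (w, s), p in joint.items():
        pw[w] = pw.get(w, 0.0) + p
        ps[s] = ps.get(s, 0.0) + p
    return sum(p * math.log(p / (pw[w] * ps[s]))
               for (w, s), p in joint.items() if p > 0)

# Toy setup (hypothetical): S is n i.i.d. fair bits; the "learner" outputs
# the majority label of S, flipped with probability `flip`.
n = 3
flip = 0.1
joint = {}
for s in itertools.product([0, 1], repeat=n):
    p_s = 0.5 ** n                      # uniform over the 2^n datasets
    maj = int(sum(s) > n / 2)
    joint[(maj, s)] = joint.get((maj, s), 0.0) + p_s * (1 - flip)
    joint[(1 - maj, s)] = joint.get((1 - maj, s), 0.0) + p_s * flip

mi = mutual_information(joint)
sigma = 0.5                             # a [0, 1]-bounded loss is 1/2-subgaussian
bound = math.sqrt(2 * sigma ** 2 * mi / n)
print(f"I(W;S) = {mi:.3f} nats, generalization bound = {bound:.3f}")
```

Lowering `flip` makes the output W more informative about S, raising I(W;S) and loosening the bound; the conditioning and processing techniques in the paper aim to tighten exactly this kind of quantity.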