Conditioning and Processing: Techniques to Improve Information-Theoretic Generalization Bounds

Authors: Hassan Hafez-Kolahi, Zeinab Golgooni, Shohreh Kasaei, Mahdieh Soleymani

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | Obtaining generalization bounds for learning algorithms is one of the main subjects studied in theoretical machine learning. In recent years, information-theoretic bounds on generalization have gained the attention of researchers. This approach provides an insight into learning algorithms by considering the mutual information between the model and the training set. In this paper, a probabilistic graphical representation of this approach is adopted and two general techniques to improve the bounds are introduced, namely conditioning and processing. These techniques can be used to improve the bounds by either sharpening them or increasing their applicability. It is demonstrated that the proposed framework provides a simple and unified way to explain a variety of recent tightening results. New improved bounds derived by utilizing these techniques are also proposed.
Researcher Affiliation | Academia | Hassan Hafez-Kolahi, Department of Computer Engineering, Sharif University of Technology, hafez@ce.sharif.edu; Zeinab Golgooni, Department of Computer Engineering, Sharif University of Technology, golgooni@ce.sharif.edu; Shohreh Kasaei, Department of Computer Engineering, Sharif University of Technology, kasaei@sharif.edu; Mahdieh Soleymani Baghshah, Department of Computer Engineering, Sharif University of Technology, soleymani@sharif.edu
Pseudocode | No | The paper contains mathematical formulas, theorems, lemmas, and figures, but no pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide any concrete access information (e.g., a specific repository link or an explicit code release statement) for the methodology described.
Open Datasets | No | The paper is theoretical and discusses concepts such as a "training set" only in a general sense; it does not describe experiments on a specific dataset or provide concrete access information for any dataset.
Dataset Splits | No | The paper is theoretical and does not describe specific dataset splits for training, validation, or testing.
Hardware Specification | No | The paper is theoretical and does not describe any experiments that would require specific hardware, so no hardware specifications are provided.
Software Dependencies | No | The paper is theoretical and does not describe any experiments that would require specific software dependencies, so none are provided.
Experiment Setup | No | The paper is theoretical and does not describe any experimental setup details, hyperparameters, or training configurations.
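As background for the "Research Type" response above, the following is a minimal sketch of the standard input-output mutual-information generalization bound that the paper's conditioning and processing techniques set out to tighten. These formulas are not quoted from the paper or this report; they are the well-known baseline result of Xu and Raginsky (2017) and, as one representative tightening, the individual-sample bound of Bu, Zou, and Veeravalli (2020), stated under the usual subgaussianity assumption.

% Baseline bound: W is the learned model, S = (Z_1, ..., Z_n) the training set,
% and the loss \ell(w, z) is assumed \sigma-subgaussian for every fixed w.
\[
\bigl|\mathbb{E}\,[\mathrm{gen}(W, S)]\bigr|
  \;\le\; \sqrt{\frac{2\sigma^{2}}{n}\, I(W; S)},
\qquad S = (Z_1, \dots, Z_n).
\]
% One well-known tightening replaces the full-sample mutual information
% with per-sample terms:
\[
\bigl|\mathbb{E}\,[\mathrm{gen}(W, S)]\bigr|
  \;\le\; \frac{1}{n}\sum_{i=1}^{n}\sqrt{2\sigma^{2}\, I(W; Z_i)}.
\]

Per the abstract, the paper's graphical framework is intended to recover tightenings of this kind, along with conditional-mutual-information-style bounds, as instances of its conditioning and processing operations; the exact derivations are given in the paper itself.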