Rate-Distortion Analysis of Minimum Excess Risk in Bayesian Learning

Authors: Hassan Hafez-Kolahi, Behrad Moniri, Shohreh Kasaei, Mahdieh Soleymani Baghshah

ICML 2021

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | In this paper, we build upon and extend the recent results of (Xu & Raginsky, 2020) to analyze the MER in Bayesian learning and derive information-theoretic bounds on it. We formulate the problem as a (constrained) rate-distortion optimization and show how the solution can be bounded above and below by two other rate-distortion functions that are easier to study. The lower bound represents the minimum possible excess risk achievable by any process using R bits of information from the parameter W. For the upper bound, the optimization is further constrained to use R bits from the training set, a setting which relates MER to information-theoretic bounds on the generalization gap in frequentist learning. We derive information-theoretic bounds on the difference between these upper and lower bounds and show that they can provide order-wise tight rates for MER under certain conditions. This analysis gives more insight into the information-theoretic nature of Bayesian learning as well as providing novel bounds.
Researcher Affiliation | Academia | (1) Department of Computer Engineering, Sharif University of Technology, Tehran, Iran; (2) Department of Electrical Engineering, Sharif University of Technology, Tehran, Iran.
Pseudocode | No | The paper does not contain any pseudocode or algorithm blocks.
Open Source Code | No | The paper does not mention providing open-source code for the methodology described.
Open Datasets | No | The paper is theoretical and does not conduct experiments with datasets. It discusses theoretical models such as the Gaussian Location Model and Linear Regression but does not use empirical data.
Dataset Splits | No | The paper is theoretical and does not involve dataset splits for training, validation, or testing.
Hardware Specification | No | The paper is theoretical and does not mention any hardware specifications used for experiments.
Software Dependencies | No | The paper is theoretical and does not list any specific software dependencies with version numbers.
Experiment Setup | No | The paper is theoretical and does not describe an experimental setup with hyperparameters or training configurations.