Sample-Conditioned Hypothesis Stability Sharpens Information-Theoretic Generalization Bounds

Authors: Ziqiao Wang, Yongyi Mao

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | We present new information-theoretic generalization guarantees through a novel construction of the "neighboring-hypothesis" matrix and a new family of stability notions termed sample-conditioned hypothesis (SCH) stability. Our approach yields sharper bounds that improve upon previous information-theoretic bounds in various learning scenarios. Notably, these bounds address the limitations of existing information-theoretic bounds in the context of stochastic convex optimization (SCO) problems, as explored in the recent work by Haghifam et al. (2023). (The standard baseline bound such results are compared against is sketched after this table.)
Researcher Affiliation | Academia | Ziqiao Wang, University of Ottawa (zwang286@uottawa.ca); Yongyi Mao, University of Ottawa (ymao@uottawa.ca)
Pseudocode | No | The paper focuses on mathematical proofs and theoretical derivations and does not include any pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide any statement about making its code open source or links to a code repository.
Open Datasets | No | The paper illustrates its findings with theoretical examples (e.g., Example 1, Example 2) that involve synthetic distributions or abstract setups, but it does not use or provide access to publicly available datasets for empirical training or evaluation.
Dataset Splits | No | The paper is theoretical and uses abstract examples for analysis rather than performing empirical experiments with defined training, validation, or test dataset splits.
Hardware Specification | No | The paper does not mention any specific hardware used for computations or experiments.
Software Dependencies | No | The paper discusses algorithms like GD and SGD but does not specify any software packages, libraries, or their version numbers.
Experiment Setup | No | The paper is theoretical; quantities such as the number of GD iterations and the learning rate in Example 1 belong to the theoretical problem setup, not to a reproducible experimental setup with specific hyperparameters for empirical runs. (An illustrative gradient-descent setup of this kind is sketched after the table.)
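
For context on what "sharper bounds" means in the Research Type row above: results of this kind are usually measured against the classical mutual-information generalization bound of Xu and Raginsky (2017). The LaTeX sketch below states that baseline bound as background; it is not an equation taken from this page, and it assumes a sigma-sub-Gaussian loss.

% Baseline information-theoretic generalization bound (Xu & Raginsky, 2017).
% W is the learned hypothesis, S = (Z_1, ..., Z_n) the training sample, and the
% loss \ell(w, Z) is assumed \sigma-sub-Gaussian under the data distribution.
\[
  \left| \mathbb{E}\bigl[ L_\mu(W) - L_S(W) \bigr] \right|
  \;\le\;
  \sqrt{\frac{2\sigma^{2}\, I(W; S)}{n}},
\]
% where L_\mu(W) is the population risk, L_S(W) is the empirical risk on S, and
% I(W; S) is the mutual information between the learned hypothesis and the sample.

Bounds built from sample-conditioned quantities, such as the SCH stability notions described above, aim to tighten this dependence on I(W; S), which can be uninformative in settings such as stochastic convex optimization.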
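
The Experiment Setup row notes that the GD iteration count and learning rate appear only as part of a theoretical problem setup. Purely as an illustration of what such a configuration looks like when run empirically (the paper itself contains no code), here is a minimal Python sketch; the quadratic objective, iteration count T, and learning rate eta are hypothetical placeholders, not values taken from the paper.

import numpy as np

# Illustrative only: a generic full-batch gradient-descent loop of the kind the
# paper analyzes theoretically. T and eta are hypothetical placeholders.
def gradient_descent(grad, w0, eta=0.1, T=100):
    """Run T gradient-descent steps with constant learning rate eta."""
    w = np.asarray(w0, dtype=float)
    for _ in range(T):
        w = w - eta * grad(w)
    return w

if __name__ == "__main__":
    # Minimize f(w) = 0.5 * ||w||^2, whose gradient is w itself.
    w_final = gradient_descent(grad=lambda w: w, w0=[1.0, -2.0], eta=0.1, T=100)
    print("final iterate:", w_final)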