Masked Prediction: A Parameter Identifiability View

Authors: Bingbin Liu, Daniel J. Hsu, Pradeep Ravikumar, Andrej Risteski

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | While SSL has been enjoying a rapid growth on the empirical front, theoretical understanding of why and when SSL works is still nascent. In no small part, this is because formalizing the desired guarantees seems challenging. For instance, the focus of SSL has largely been on learning good features, which in practice has been quantified by downstream performance on various benchmark datasets. To provide theoretical underpinning to this, one needs to make extra assumptions on the relationship between the self-supervised prediction task and the downstream tasks. This work provides a positive answer for broad classes of HMMs. We also note that we have focused here on population analyses, and model identifiability. It would be of interest to build off this to develop and analyze the corresponding finite-sample learning algorithms for parametric generative models given SSL tasks, with sample complexity results, both in the realizable case, as well as in the agnostic case where we have model mis-specification.
Researcher Affiliation | Academia | Bingbin Liu, Carnegie Mellon University, bingbinl@cs.cmu.edu; Daniel Hsu, Columbia University, djhsu@cs.columbia.edu; Pradeep Ravikumar, Carnegie Mellon University, pradeepr@cs.cmu.edu; Andrej Risteski, Carnegie Mellon University, aristesk@andrew.cmu.edu
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide any concrete access to source code for the methodology described.
Open Datasets | No | The paper is theoretical and does not use or describe datasets for experimental training.
Dataset Splits | No | The paper is theoretical and does not report on experiments, so no dataset splits for validation are provided.
Hardware Specification | No | The paper is theoretical and does not report on experiments, thus no specific hardware details are provided.
Software Dependencies | No | The paper is theoretical and does not report on experiments, thus no specific software dependencies with version numbers are provided.
Experiment Setup | No | The paper is theoretical and does not report on experiments, thus no specific experimental setup details are provided.