Minimum Description Length and Generalization Guarantees for Representation Learning

Authors: Milad Sefidgaran, Abdellatif Zaidi, Piotr Krasnowski

NeurIPS 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Numerical simulations illustrate the advantages of such well-chosen priors over classical priors used in IB. The results shown in Fig. 2 indicate that the model trained using our priors achieves better (≈ 2.5%) performance in terms of both generalization error and population risk. We consider CIFAR10 [KH09] image classification using a small CNN-based encoder and a linear decoder.
Researcher Affiliation | Collaboration | Milad Sefidgaran (Paris Research Center, Huawei Technologies France), Abdellatif Zaidi (Paris Research Center, Huawei Technologies France; Université Gustave Eiffel, France), Piotr Krasnowski (Paris Research Center, Huawei Technologies France)
Pseudocode | No | The paper does not contain any sections explicitly labeled "Pseudocode" or "Algorithm", nor any visually structured algorithm blocks.
Open Source Code | Yes | The code used in the experiments is available at https://github.com/PiotrKrasnowski/MDL_and_Generalization_Guarantees_for_Representation_Learning.
Open Datasets | Yes | We consider CIFAR10 [KH09] image classification using a small CNN-based encoder and a linear decoder.
Dataset Splits | Yes | The full dataset was split into a training set with 50,000 labeled images and a validation set with 10,000 labeled images, all of them of size 32 × 32 × 3.
Hardware Specification | Yes | Our prediction model was trained using PyTorch [PGM+19] and a GPU Tesla P100 with CUDA 11.0.
Software Dependencies | Yes | Our prediction model was trained using PyTorch [PGM+19] and a GPU Tesla P100 with CUDA 11.0. The Adam optimizer [KB15] (β1 = 0.5, β2 = 0.999) was used with an initial learning rate of 10⁻⁴ and an exponential decay of 0.97.
Experiment Setup | Yes | The Adam optimizer [KB15] (β1 = 0.5, β2 = 0.999) was used with an initial learning rate of 10⁻⁴ and an exponential decay of 0.97. The batch size was equal to 128 throughout the whole experiment. During the training phase, we jointly trained the encoder and the decoder parts for 200 epochs.
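For concreteness, the reported data split and training hyperparameters (Adam with β1 = 0.5, β2 = 0.999, initial learning rate 10⁻⁴, exponential decay of 0.97, batch size 128, 200 epochs of joint encoder/decoder training) could be wired up roughly as in the PyTorch sketch below. The small CNN encoder and linear decoder are placeholders rather than the authors' actual architecture, and the plain cross-entropy loss stands in for the paper's prior-based objective; the exact implementation is in the linked repository.

```python
# Hedged sketch of the reported training setup (CIFAR10, Adam, exponential LR decay).
# The encoder/decoder shapes and the loss are illustrative assumptions, not the paper's model.
import torch
import torch.nn as nn
import torchvision
import torchvision.transforms as transforms

transform = transforms.ToTensor()
# torchvision's CIFAR10 already provides the 50,000/10,000 train/validation split of 32x32x3 images.
train_set = torchvision.datasets.CIFAR10("./data", train=True, download=True, transform=transform)
val_set = torchvision.datasets.CIFAR10("./data", train=False, download=True, transform=transform)
train_loader = torch.utils.data.DataLoader(train_set, batch_size=128, shuffle=True)
val_loader = torch.utils.data.DataLoader(val_set, batch_size=128, shuffle=False)

# Placeholder small CNN encoder and linear decoder (10 output classes).
encoder = nn.Sequential(
    nn.Conv2d(3, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),   # 32x32 -> 16x16
    nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1), nn.ReLU(),  # 16x16 -> 8x8
    nn.Flatten(),
    nn.Linear(64 * 8 * 8, 128),
)
decoder = nn.Linear(128, 10)
model = nn.Sequential(encoder, decoder)

# Adam with beta1 = 0.5, beta2 = 0.999, lr = 1e-4, and exponential decay 0.97 applied per epoch.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4, betas=(0.5, 0.999))
scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.97)
criterion = nn.CrossEntropyLoss()  # stand-in for the paper's prior-regularized objective

for epoch in range(200):  # encoder and decoder trained jointly for 200 epochs
    for images, labels in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
    scheduler.step()
```

One interpretation made here is stepping the exponential decay once per epoch; the paper only reports a decay factor of 0.97, so the schedule granularity is an assumption.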