Recognizable Information Bottleneck
Authors: Yilin Lyu, Xin Liu, Mingyang Song, Xinyue Wang, Yaxin Peng, Tieyong Zeng, Liping Jing
IJCAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on several commonly used datasets demonstrate the effectiveness of the proposed method in regularizing the model and estimating the generalization gap. |
| Researcher Affiliation | Academia | (1) Beijing Key Lab of Traffic Data Analysis and Mining, Beijing Jiaotong University; (2) Department of Mathematics, School of Science, Shanghai University; (3) The Chinese University of Hong Kong. {yilinlyu, xin.liu, mingyang.song, xinyuewang, lpjing}@bjtu.edu.cn, yaxin.peng@shu.edu.cn, zeng@math.cuhk.edu.hk |
| Pseudocode | Yes | Algorithm 1 Optimization of Recognizable Information Bottleneck (RIB) |
| Open Source Code | Yes | Code is available at https://github.com/lvyilin/RecogIB. |
| Open Datasets | Yes | Our experiments are mainly conducted on three widely-used datasets: Fashion-MNIST [Xiao et al., 2017], SVHN [Netzer et al., 2011] and CIFAR10 [Krizhevsky and Hinton, 2009]. We also give the results on MNIST and STL10 [Coates et al., 2011] in Appendix C. |
| Dataset Splits | No | No explicit training/validation/test dataset splits are provided within the paper. It mentions using a validation set as a ghost set ('Unless otherwise stated, the validation set is used as the ghost set.') and references an external paper for the data setting, but doesn't specify the splits in detail. |
| Hardware Specification | Yes | All the experiments are implemented with PyTorch and performed on eight NVIDIA RTX A4000 GPUs. |
| Software Dependencies | No | The paper states 'All the experiments are implemented with PyTorch' but does not specify the version number of PyTorch or any other software dependencies. |
| Experiment Setup | Yes | We use a DNN model composed of a 4-layer CNN (128-128-256-1024) and a 2-layer MLP (1024-512) as the encoder, and use a 4-layer MLP (1024-1024-1) as the recognizability critic. We train the learning model using the Adam optimizer [Kingma and Ba, 2015] with betas of (0.9, 0.999) and train the recognizability critic using SGD with momentum of 0.9, as it is more stable in practice. All learning rates are set to 0.001, and the models are trained for 100 epochs using the cosine annealing learning rate scheme with a batch size of 128. The trade-off parameter β is selected from {10^-1, 10^0, 10^1, 10^2} according to the desired regularization strength, as discussed later. |
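
The experiment-setup row above can be mirrored in a short PyTorch sketch. This is not the authors' code (see their RecogIB repository for that): kernel sizes, pooling, activations, the classifier head, and how the critic consumes representation pairs are assumptions made here for illustration; only the layer widths, optimizer choices, learning rates, scheduler, batch size, and β grid come from the quoted description.

```python
# Hedged sketch of the reported training configuration (not the official implementation).
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """4-layer CNN (128-128-256-1024) followed by a 2-layer MLP (1024-512)."""
    def __init__(self, in_channels=3):
        super().__init__()
        # Kernel sizes, pooling, and activations are assumptions; only channel widths are from the paper.
        self.cnn = nn.Sequential(
            nn.Conv2d(in_channels, 128, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(128, 128, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(128, 256, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(256, 1024, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.mlp = nn.Sequential(
            nn.Flatten(),
            nn.Linear(1024, 1024), nn.ReLU(),
            nn.Linear(1024, 512),
        )

    def forward(self, x):
        return self.mlp(self.cnn(x))

# Recognizability critic: 4-layer MLP with 1024-wide hidden layers and a scalar output.
# The input dimension assumes a concatenated pair of 512-d representations (an assumption).
critic = nn.Sequential(
    nn.Linear(512 * 2, 1024), nn.ReLU(),
    nn.Linear(1024, 1024), nn.ReLU(),
    nn.Linear(1024, 1024), nn.ReLU(),
    nn.Linear(1024, 1),
)

encoder = Encoder()
classifier = nn.Linear(512, 10)  # 10-way head for e.g. CIFAR10 (assumed)
beta = 1e1                       # trade-off parameter, chosen from {1e-1, 1e0, 1e1, 1e2}

# Adam for the learning model, SGD with momentum for the critic, both at lr 0.001.
opt_model = torch.optim.Adam(
    list(encoder.parameters()) + list(classifier.parameters()),
    lr=1e-3, betas=(0.9, 0.999),
)
opt_critic = torch.optim.SGD(critic.parameters(), lr=1e-3, momentum=0.9)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(opt_model, T_max=100)
# Training then runs for 100 epochs with batch size 128, alternating updates of the
# learning model and the recognizability critic (Algorithm 1 in the paper).
```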