Trading Information between Latents in Hierarchical Variational Autoencoders
Authors: Tim Z. Xiao, Robert Bamler
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To demonstrate the features of our hierarchical information trading framework, we run large-scale grid searches over a two-dimensional rate space using two different implementations of HVAEs and three different data sets. We trained 441 different HVAEs for each data set/model combination, scanning the rate-hyperparameters (β2, β1) over a 21 × 21 grid ranging from 0.1 to 10 on a log scale in both directions (see Figure 1 on page 2, right panels). |
| Researcher Affiliation | Academia | Tim Z. Xiao University of Tübingen & IMPRS-IS zhenzhong.xiao@uni-tuebingen.de Robert Bamler University of Tübingen robert.bamler@uni-tuebingen.de |
| Pseudocode | No | The paper does not contain any explicitly labeled "Pseudocode" or "Algorithm" blocks. |
| Open Source Code | Yes | All code necessary to reproduce the results in this paper is available at https://github.com/timxzz/HIT/ |
| Open Datasets | Yes | We used the SVHN (Netzer et al., 2011) and CIFAR-10 (Krizhevsky, 2009) data sets (both 32 × 32 pixel color images), and MNIST (LeCun et al., 1998) (28 × 28 binary pixel images). |
| Dataset Splits | No | The paper mentions using SVHN, CIFAR-10, and MNIST and refers to a 'training set' and 'labeled test set', but does not provide specific details on the dataset splits (e.g., percentages, sample counts for training, validation, and testing, or references to standard splits). For example, it does not specify whether a validation set was used for hyperparameter tuning, or its size. |
| Hardware Specification | Yes | Each model took about 2 hours to train on an RTX-2080Ti GPU (≈ 27 hours in total for each data set/model combination using 32 GPUs in parallel). |
| Software Dependencies | No | The paper mentions using 'scikit-learn', 'ResNet-18', and 'DenseNet-121' for classifiers, but does not provide specific version numbers for these libraries or any other software dependencies. |
| Experiment Setup | Yes | We trained 441 different HVAEs for each data set/model combination, scanning the rate-hyperparameters (β2, β1) over a 21 × 21 grid ranging from 0.1 to 10 on a log scale in both directions. Table 2: Model architecture details for generalized top-down HVAEs (GHVAEs) used in Section 5. Conv and Conv Transp denote convolutional and transposed convolutional layers, listed with their input channels, output channels, kernel size, stride, and padding; FC denotes a fully connected layer. [...] z1 dims: 512; z2 dims: 32; σx = 0.71; total params: 475,811. (A hedged sketch of this grid setup follows the table.) |
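
To make the scale of the grid search above concrete, the following is a minimal sketch (not taken from the authors' repository) of how the 21 × 21 log-scale grid of rate hyperparameters and a two-β weighted HVAE objective could be set up. The helper `two_beta_elbo_loss` and its argument names are assumptions introduced here for illustration only; the authors' actual training code is available at https://github.com/timxzz/HIT/.

```python
import itertools

import numpy as np

# 21 values from 0.1 to 10, evenly spaced on a log scale in each direction,
# giving the 21 x 21 = 441 (beta_2, beta_1) combinations reported above.
betas = np.logspace(np.log10(0.1), np.log10(10.0), num=21)
beta_grid = list(itertools.product(betas, betas))
assert len(beta_grid) == 441


def two_beta_elbo_loss(distortion, rate_z1, rate_z2, beta1, beta2):
    """Illustrative beta-weighted objective for a two-layer HVAE (hypothetical).

    `distortion` stands for the negative expected reconstruction log-likelihood,
    and `rate_z1` / `rate_z2` for the KL (rate) terms of the two latent layers.
    The exact parameterization in the paper may differ; this only shows how the
    two rate hyperparameters trade information between the latents.
    """
    return distortion + beta1 * rate_z1 + beta2 * rate_z2
```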