Learning Sparse Latent Representations with the Deep Copula Information Bottleneck
Authors: Aleksander Wieczorek*, Mario Wieser*, Damian Murezzan, Volker Roth
ICLR 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our method on artificial and real data. |
| Researcher Affiliation | Academia | Aleksander Wieczorek, Mario Wieser, Damian Murezzan, Volker Roth; University of Basel, Switzerland; {firstname.lastname}@unibas.ch |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not include an explicit statement about releasing source code for the methodology or provide a link to a code repository. |
| Open Datasets | Yes | We consider the unnormalised Communities and Crime dataset (Lyons et al., 1998) from the UCI repository (http://archive.ics.uci.edu/ml/datasets/communities+and+crime+unnormalized). The dataset consists of 125 predictive, 4 non-predictive and 18 target variables with 2215 samples in total. In a preprocessing step, we removed all missing values from the dataset, leaving 1901 observations with 102 predictive and 18 target variables for the analysis. (A reconstruction of this preprocessing is sketched below the table.) |
| Dataset Splits | No | Only a train/test split for the artificial data is reported ("We split the samples into test (20k samples) and training (180k samples) sets."); no splits are specified for the real Communities and Crime dataset. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments, such as GPU or CPU models. |
| Software Dependencies | No | The paper mentions the use of the Adam optimizer but does not specify the versions of programming languages, libraries, or other software dependencies used for the experiments. |
| Experiment Setup | Yes | We use a latent layer with ten nodes that model the means of the ten-dimensional latent space t; the variance of the latent space is set to 1 for simplicity. The encoder and the decoder each consist of a neural network with two fully-connected hidden layers of 50 nodes, using the softplus activation function. The model is trained with mini-batches (size 500) using the Adam optimiser (Kingma & Ba, 2014) for 70,000 iterations at a learning rate of 0.0006. (A code sketch of this setup follows the table.) |
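
The Open Datasets row describes a preprocessing pipeline that can be loosely reconstructed. Below is a minimal sketch, assuming pandas and the unnormalised Communities and Crime file from the UCI repository; the file name, the position of the non-predictive columns, and the order of the drop operations are assumptions (the authors released no code), chosen only so that column-wise and then row-wise removal of missing values could plausibly yield the reported 1901 observations with 102 predictive and 18 target variables.

```python
# Minimal preprocessing sketch for the unnormalised Communities and Crime data.
# File name, column positions, and drop order are assumptions, not the authors' code.
import pandas as pd

# UCI distributes the data without a header row; '?' marks missing values.
df = pd.read_csv("CommViolPredUnnormalizedData.txt", header=None, na_values="?")

# Drop the 4 non-predictive identifier columns (assumed to lead the file),
# leaving the 125 predictive and 18 target variables.
df = df.drop(columns=df.columns[:4])

# Remove missing values: first drop sparsely observed columns (the paper goes
# from 125 to 102 predictors), then drop incomplete rows (2215 -> 1901
# observations). The thresh value is an assumption.
df = df.dropna(axis=1, thresh=int(0.85 * len(df)))
df = df.dropna(axis=0)

X = df.iloc[:, :-18].to_numpy()  # predictive variables
Y = df.iloc[:, -18:].to_numpy()  # 18 target variables
```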
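The Experiment Setup row fully specifies the network sizes and optimiser settings, which the sketch below translates into code. This is a hedged reconstruction assuming PyTorch: the paper's actual copula information bottleneck objective is not reproduced, so the loss uses a generic reconstruction-plus-KL stand-in with an arbitrary weight, and random tensors stand in for the preprocessed data.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
LATENT_DIM, HIDDEN, X_DIM, Y_DIM = 10, 50, 102, 18    # latent/hidden sizes from the paper; I/O dims assumed

encoder = nn.Sequential(                              # two fully-connected hidden layers, 50 softplus units each
    nn.Linear(X_DIM, HIDDEN), nn.Softplus(),
    nn.Linear(HIDDEN, HIDDEN), nn.Softplus(),
    nn.Linear(HIDDEN, LATENT_DIM),                    # models the means of the 10-dim latent space t
)
decoder = nn.Sequential(
    nn.Linear(LATENT_DIM, HIDDEN), nn.Softplus(),
    nn.Linear(HIDDEN, HIDDEN), nn.Softplus(),
    nn.Linear(HIDDEN, Y_DIM),
)

opt = torch.optim.Adam(
    list(encoder.parameters()) + list(decoder.parameters()),
    lr=6e-4,                                          # learning rate 0.0006, as reported
)

X = torch.randn(1901, X_DIM)                          # placeholder data; substitute the preprocessed dataset
Y = torch.randn(1901, Y_DIM)

for step in range(70_000):                            # 70,000 iterations, as reported
    idx = torch.randint(0, X.size(0), (500,))         # mini-batch of size 500
    mu = encoder(X[idx])
    t = mu + torch.randn_like(mu)                     # latent variance fixed to 1
    kl = 0.5 * (mu ** 2).sum(dim=1).mean()            # KL(N(mu, I) || N(0, I)) with unit variance
    loss = nn.functional.mse_loss(decoder(t), Y[idx]) + 1e-3 * kl  # stand-in objective, NOT the paper's copula IB loss
    opt.zero_grad()
    loss.backward()
    opt.step()
```

The fixed unit variance of the latent layer makes the KL term reduce to half the squared latent means, which is why no variance head appears in the encoder sketch.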