$t^3$-Variational Autoencoder: Learning Heavy-tailed Data with Student's t and Power Divergence

Authors: Juno Kim, Jaehyuk Kwon, Mincheol Cho, Hyunjong Lee, Joong-Ho Won

ICLR 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "t3VAE demonstrates superior generation of low-density regions when trained on heavy-tailed synthetic data. Furthermore, we show that t3VAE significantly outperforms other models on CelebA and imbalanced CIFAR-100 datasets." |
| Researcher Affiliation | Academia | "(1) Department of Mathematical Informatics, The University of Tokyo; (2) Center for Advanced Intelligence Project, RIKEN; (3) Department of Statistics, Seoul National University" |
| Pseudocode | Yes | "A summary of our framework is provided in Algorithm 1 to assist with implementation." |
| Open Source Code | Yes | "The code is available on Github." |
| Open Datasets | Yes | "We now showcase the effectiveness of our model on high-dimensional data via both reconstruction and generation tasks in CelebA (Liu et al., 2015; 2018)... we conduct reconstruction experiments with the CIFAR100-LT dataset (Cao et al., 2019), which is a long-tailed version of the original CIFAR-100 (Krizhevsky, 2009)." |
| Dataset Splits | Yes | "We first generate 200K train data, 200K validation data, and 500K test data from the heavy-tailed bimodal distribution (22)." (sampling sketch below the table) |
| Hardware Specification | Yes | "All experiments are implemented via Python 3.8.10 with the PyTorch package (Paszke et al., 2019) version 1.13.1+cu117, and run on Linux Ubuntu 20.04 with Intel Xeon Silver 4114 @ 2.20GHz processors, an Nvidia Titan V GPU with 12GB memory, CUDA 11.3 and cuDNN 8.2." |
| Software Dependencies | Yes | "All experiments are implemented via Python 3.8.10 with the PyTorch package (Paszke et al., 2019) version 1.13.1+cu117, and run on Linux Ubuntu 20.04 with Intel Xeon Silver 4114 @ 2.20GHz processors, an Nvidia Titan V GPU with 12GB memory, CUDA 11.3 and cuDNN 8.2." (environment check below the table) |
| Experiment Setup | Yes | "In the training process, we use a batch size of 128 and employ the Adam optimizer (Kingma & Ba, 2014) with a learning rate of $1 \times 10^{-3}$ and weight decay $1 \times 10^{-4}$. Moreover, we adopt early stopping using validation data with patience 15 to prevent overfitting. All VAE models are trained for 50 epochs using a batch size of 128 and a latent variable dimension of 64 with the Adam optimizer." (training-loop sketch below the table) |
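
The synthetic-data split above (200K train / 200K validation / 500K test) is straightforward to mirror. The paper's heavy-tailed bimodal density (its Eq. 22) is not reproduced here, so the sampler below is only a placeholder: a two-component Student-t mixture whose degrees of freedom, mode locations, scale, and dimensionality are illustrative assumptions, not the authors' specification.

```python
import torch

def sample_heavy_tailed_bimodal(n, df=5.0, modes=(-2.0, 2.0), scale=1.0):
    """Placeholder for the paper's bimodal heavy-tailed density (its Eq. 22):
    a two-component Student-t mixture; df, modes, and scale are assumptions."""
    comp = torch.randint(0, 2, (n,))                        # pick a mode per sample
    noise = torch.distributions.StudentT(df).sample((n,))   # heavy-tailed noise
    return torch.tensor(modes)[comp] + scale * noise

# Split sizes reported in the paper: 200K train, 200K validation, 500K test.
train_x = sample_heavy_tailed_bimodal(200_000)
val_x = sample_heavy_tailed_bimodal(200_000)
test_x = sample_heavy_tailed_bimodal(500_000)
```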
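
To check whether a local environment matches the reported software stack (Python 3.8.10, PyTorch 1.13.1+cu117, CUDA 11.3, cuDNN 8.2, Nvidia Titan V with 12GB memory), a minimal version probe such as the following can be used; it is a convenience sketch for reproduction, not part of the authors' code.

```python
import platform
import torch

# Compare the local runtime against the versions reported in the paper.
print("Python:", platform.python_version(), "(reported: 3.8.10)")
print("PyTorch:", torch.__version__, "(reported: 1.13.1+cu117)")
print("CUDA:", torch.version.cuda, "(reported: 11.3)")
print("cuDNN:", torch.backends.cudnn.version(), "(reported: 8.2)")
print("GPU:", torch.cuda.get_device_name(0) if torch.cuda.is_available() else "none",
      "(reported: Nvidia Titan V, 12GB)")
```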
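
The training configuration quoted above (batch size 128, Adam with learning rate $1 \times 10^{-3}$ and weight decay $1 \times 10^{-4}$, 50 epochs, latent dimension 64, early stopping with patience 15) maps onto a standard PyTorch loop. The sketch below is a minimal stand-in, not the authors' implementation: a toy Gaussian-VAE module and random tensors replace the $t^3$VAE model and the CelebA pipeline, only the hyperparameters come from the paper, and the paper's $\gamma$-power divergence objective would replace the placeholder loss.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Hyperparameters quoted in the paper's experiment setup.
BATCH_SIZE, EPOCHS, LATENT_DIM = 128, 50, 64
LR, WEIGHT_DECAY, PATIENCE = 1e-3, 1e-4, 15

class ToyVAE(nn.Module):
    """Illustrative Gaussian-VAE stand-in; swap in the paper's t3VAE and
    gamma-power divergence objective for an actual reproduction."""
    def __init__(self, d_in=64, d_z=LATENT_DIM):
        super().__init__()
        self.enc = nn.Linear(d_in, 2 * d_z)   # outputs mean and log-variance
        self.dec = nn.Linear(d_z, d_in)

    def loss(self, x):
        mu, logvar = self.enc(x).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()   # reparameterize
        recon = ((self.dec(z) - x) ** 2).sum(-1).mean()
        kl = 0.5 * (mu.pow(2) + logvar.exp() - 1 - logvar).sum(-1).mean()
        return recon + kl   # placeholder objective

# Random tensors stand in for the real training and validation sets.
model = ToyVAE()
train_loader = DataLoader(TensorDataset(torch.randn(1024, 64)),
                          batch_size=BATCH_SIZE, shuffle=True)
val_loader = DataLoader(TensorDataset(torch.randn(256, 64)), batch_size=BATCH_SIZE)
optimizer = torch.optim.Adam(model.parameters(), lr=LR, weight_decay=WEIGHT_DECAY)

best_val, patience_left = float("inf"), PATIENCE
for epoch in range(EPOCHS):
    model.train()
    for (x,) in train_loader:
        optimizer.zero_grad()
        model.loss(x).backward()
        optimizer.step()

    model.eval()
    with torch.no_grad():
        val_loss = sum(model.loss(x).item() for (x,) in val_loader) / len(val_loader)

    # Early stopping on validation loss with patience 15, as reported.
    if val_loss < best_val:
        best_val, patience_left = val_loss, PATIENCE
    else:
        patience_left -= 1
        if patience_left == 0:
            break
```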