Learning from Nested Data with Ornstein Auto-Encoders
Authors: Youngwon Choi, Sungdong Lee, Joong-Ho Won
ICML 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the high performance of PSOAE in the three key tasks of generative models: exemplar generation, style transfer, and new concept generation. |
| Researcher Affiliation | Academia | Youngwon Choi 1 2 Sungdong Lee 1 Joong-Ho Won 1 1Department of Statistics, Seoul National University. 2Current affiliation: UCLA Center for Vision & Imaging Biomarkers. |
| Pseudocode | Yes | Algorithm 1 Product-space OAE training |
| Open Source Code | No | The paper does not provide any explicit statements about releasing source code or links to a code repository. |
| Open Datasets | Yes | For the MNIST data, randomly selected 40,357 images were used for training, and the rest were used for testing. |
| Dataset Splits | Yes | Hyperparameters were hand-tuned using the performance on validation datasets. |
| Hardware Specification | No | The paper mentions training durations (e.g., "it took 3300 epochs (100 iterations per epoch)") but does not specify the hardware used (e.g., CPU, GPU models, memory). |
| Software Dependencies | No | The paper mentions using the ADAM optimizer (Kingma & Ba, 2014) and adapting network architectures from WAE (Tolstikhin et al., 2018), but does not specify software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | For all the experiments, X = R^{d_X} and Z = R^{d_Z} with the Euclidean metric d(x, x') = \|x − x'\|_2. The prior distribution P_Z of the latent variable Z follows model (12). The independent standard normal priors P_B = N(0, I_{d_I}) and P_{E_0} = N(0, I_{d_V}) were set over I = R^{d_I} and V = R^{d_V}, respectively. The identity encoder Q_{B\|X_0} and the within-unit variation encoder Q_{E_0\|B,X_0} were also Gaussian: B_i \| {X_0 = x_{ij}} ~ N(µ_B(x_{ij}), σ²_B(x_{ij}) I_{d_I}) and E_{ij} \| {B = b_i, X_0 = x_{ij}} ~ N(µ_E(x_{ij}, b_i), σ²_E(x_{ij}, b_i) I_{d_V}), with mean functions µ_B : X → I, µ_E : X × I → V and variance functions σ²_B : X → R_{++}, σ²_E : X × I → R_{++}. For the VGGFace2 experiments, µ_B and σ²_B were initialized with a pre-trained classifier (Cao et al., 2018); µ_E and σ_E were designed to share most of the network to prevent overfitting. ... The optimization was conducted with the ADAM optimizer (Kingma & Ba, 2014). For VGGFace2, it took 3300 epochs (100 iterations per epoch) to declare convergence, with most of the significant reductions occurring within the first 700 epochs. Hyperparameters were hand-tuned using performance on validation datasets. The network architectures were adapted from the WAE of Tolstikhin et al. (2018). ... For the VGGFace2 dataset, each face image was cropped and rescaled to a common size of 128 × 128 pixels. |
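The experiment-setup evidence describes a product-space latent model: a shared identity code B_i per unit with prior N(0, I_{d_I}), an independent within-unit variation code E_ij with prior N(0, I_{d_V}), and Gaussian encoders parameterized by mean and variance functions. A minimal sketch of that sampling structure is below; the dimensions, the toy stand-ins for µ_B, σ_B, µ_E, σ_E, and the function names are illustrative assumptions, not the paper's actual networks or values.

```python
import numpy as np

rng = np.random.default_rng(0)

d_I, d_V = 8, 4        # identity / within-unit latent dims (illustrative, not the paper's)
n_units, n_obs = 3, 5  # units i, observations j per unit

# Prior draws: P_B = N(0, I_{d_I}) per unit, P_{E_0} = N(0, I_{d_V}) per observation
B_prior = rng.standard_normal((n_units, d_I))
E_prior = rng.standard_normal((n_units, n_obs, d_V))

def encode_B(x, rng):
    """Sample B ~ N(mu_B(x), sigma_B(x)^2 I_{d_I}); mu_B, sigma_B are toy stand-ins."""
    mu_B = x[..., :d_I]             # placeholder mean function mu_B : X -> I
    sigma_B = 0.1 * np.ones(d_I)    # placeholder variance function sigma_B : X -> R++
    return mu_B + sigma_B * rng.standard_normal(mu_B.shape)

def encode_E(x, b, rng):
    """Sample E ~ N(mu_E(x, b), sigma_E(x, b)^2 I_{d_V}); toy stand-ins again."""
    mu_E = x[..., :d_V] + b[..., :d_V]  # placeholder mean function mu_E : X x I -> V
    sigma_E = 0.1 * np.ones(d_V)
    return mu_E + sigma_E * rng.standard_normal(mu_E.shape)

# Toy data: n_obs flattened "images" per unit, with d_X = 16 features each
x = rng.standard_normal((n_units, n_obs, 16))
b = encode_B(x.mean(axis=1), rng)                          # one identity code per unit
e = encode_E(x, np.repeat(b[:, None, :], n_obs, 1), rng)   # one variation code per image

# Full latent Z_ij = (B_i, E_ij) in the product space I x V
Z = np.concatenate([np.repeat(b[:, None, :], n_obs, axis=1), e], axis=-1)
print(Z.shape)  # (3, 5, 12)
```

The key structural point this illustrates is that B_i is sampled once per unit and broadcast across that unit's observations, while E_ij is sampled independently per observation, matching the hierarchical prior quoted above.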