Learning Manifold Dimensions with Conditional Variational Autoencoders

Authors: Yijia Zheng, Tong He, Yixuan Qiu, David P Wipf

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Section 4 supports our theoretical conclusions and analysis with numerical experiments on both synthetic and real-world datasets. 4 Experiments: In this section we first corroborate our previous analysis in a controllable, synthetic environment; later we extend to real-world datasets to further support our conclusions.
Researcher Affiliation | Collaboration | Yijia Zheng (1), Tong He (2), Yixuan Qiu (3), David Wipf (2). (1) Department of Statistics, Purdue University; (2) Amazon Web Services; (3) School of Statistics and Management, Shanghai University of Finance and Economics. zheng709@purdue.edu, {htong, daviwipf}@amazon.com, qiuyixuan@sufe.edu.cn
Pseudocode | No | The paper describes models and processes in prose and mathematical formulations, but does not include any structured pseudocode or algorithm blocks.
Open Source Code | Yes | Code is available at https://github.com/zhengyjzoe/manifold-dimensions-cvae
Open Datasets | Yes | Real-world Data: We also investigate model behaviors via MNIST [23] and Fashion MNIST [22].
Dataset Splits | No | For all experiments with synthetic data, the training set size is 100,000.
Hardware Specification | No | The paper states 'See Appendix' for details regarding compute and resources used. However, the appendix is not provided in the given document, so the specific hardware specifications cannot be extracted from the main paper text.
Software Dependencies | No | The paper does not specify any software dependencies (e.g., programming languages, libraries, frameworks) with their version numbers required for replication.
Experiment Setup | Yes | The VAE/CVAE model architectures used for our experiments are described in Section A of the appendix. Specifically, using the synthetic dataset, we vary the ambient dimension d ∈ {10, 20, 30} and the ground-truth manifold dimension r ∈ {2, 4, 6, 8, 10} and compare with the estimated number of active dimensions with κ ∈ {5, 20}. Table 3 shows the results, whereby the CVAE correctly learns that AD = r − t across all values of t. For this purpose, we set c to be the first t dimensions of u by equating G_c to a t-dim identity matrix I_t, and let d = κ = 20, and r = 10. We next investigate how the initialization of γ impacts VAE/CVAE model convergence as related to the discussion in Section 3.2. Results with synthetic data and d = 20, κ = 20, r = 10, and for the CVAE, t = 5 are shown in Table 7 as the initial γ is varied.
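
The synthetic setup quoted in the Experiment Setup row can be illustrated with a short sketch. This is not the authors' code: the linear map G, the standard-normal latent u, and the helper names (sweep_grid, make_dataset, expected_active_dims) are assumptions introduced here for illustration only. What comes from the quoted text is the sweep over d, r, and κ, the choice of c as the first t dimensions of u, the training set size of 100,000, and the reported relation AD = r − t.

```python
import itertools
import random


def sweep_grid():
    """Enumerate the (d, r, kappa) settings quoted in the setup row."""
    ambient_dims = [10, 20, 30]        # d: ambient dimension
    manifold_dims = [2, 4, 6, 8, 10]   # r: ground-truth manifold dimension
    latent_dims = [5, 20]              # kappa: latent dimension of the model
    return list(itertools.product(ambient_dims, manifold_dims, latent_dims))


def make_dataset(n, d, r, t, seed=0):
    """Draw n pairs (x, c) with x on an r-dimensional subspace of R^d.

    Assumption: x = G u with a random d-by-r matrix G and u ~ N(0, I_r).
    The paper's actual generator may differ (for example, be nonlinear).
    From the quoted text: c is the first t dimensions of u.
    """
    rng = random.Random(seed)
    G = [[rng.gauss(0.0, 1.0) for _ in range(r)] for _ in range(d)]
    data = []
    for _ in range(n):
        u = [rng.gauss(0.0, 1.0) for _ in range(r)]
        x = [sum(g * v for g, v in zip(row, u)) for row in G]
        c = u[:t]
        data.append((x, c))
    return data


def expected_active_dims(r, t):
    """Active dimensions the CVAE is reported to learn: AD = r - t."""
    return r - t


if __name__ == "__main__":
    print(len(sweep_grid()), "sweep settings")
    # Small n for a quick check; the quoted training set size is 100,000.
    sample = make_dataset(n=5, d=20, r=10, t=5)
    x, c = sample[0]
    print(len(x), len(c), expected_active_dims(10, 5))
```

With d = κ = 20, r = 10, and t = 5, as in the quoted configuration, the sketch produces 20-dimensional observations with a 5-dimensional conditioning variable, and the reported relation gives r − t = 5 active dimensions for the CVAE.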