Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026.
Dimensional Collapse in VQVAEs: Evidence and Remedies
Authors: Jiayou Zhang, Yifan Shen, Guangyi Chen, Le Song, Eric P Xing
NeurIPS 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct large-scale controlled experiments varying dataset, architecture, and hyperparameters. We validate DCVQ across datasets and architectures, showing consistent gains in reconstruction quality and dimensional utilization. |
| Researcher Affiliation | Academia | ¹MBZUAI ²GenBio AI ³Carnegie Mellon University EMAIL |
| Pseudocode | Yes | Code 1: Synthetic experiment showing eigenvalue suppression under k-means quantization. `import torch; import matplotlib.pyplot as plt; from sklearn.cluster import KMeans` |
| Open Source Code | Yes | We will include anonymized code and detailed instructions to reproduce our main experimental results in the supplemental material before the final appendix submission deadline. |
| Open Datasets | Yes | We conduct experiments on three datasets: ImageNet-256, CelebA-64, and CIFAR-10. Our experiments use only publicly available datasets (CIFAR-10, CelebA, and ImageNet) under standard preprocessing. |
| Dataset Splits | No | Full training and testing details, including data splits, hyperparameters, and optimizer configurations, will be included in the submitted anonymized code and supplemental materials. |
| Hardware Specification | Yes | All experiments were conducted on NVIDIA A100 GPUs with 80GB of memory. |
| Software Dependencies | No | Our implementation is based on several publicly available GitHub repositories: lucidrains/vector-quantize-pytorch [21] (MIT License), kakaobrain/rq-vae-transformer [9] (Apache 2.0 License), cfifty/rotation_trick [5] (No explicit license; used under academic fair use as per official ICLR release), thuanz123/enhancing-transformers (MIT License), CompVis/taming-transformers [4] (MIT License) |
| Experiment Setup | Yes | Table 1: Summary of hyperparameters explored in the large-scale controlled study. Table 4: Fixed hyperparameter configuration for the experiments in Section 3. Table 5: Non-architectural hyperparameters used in the experiments in Section 5. |
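
The Pseudocode row quotes only the first lines of the paper's Code 1, so the experiment itself is not visible here. As a rough illustration of what "eigenvalue suppression under k-means quantization" can look like, the sketch below quantizes anisotropic 2-D Gaussian data and compares covariance eigenvalues before and after. It is not the authors' code: it uses only the Python standard library instead of torch and scikit-learn, and every name and setting in it (`cov_eigenvalues`, `nearest`, `kmeans_quantize`, the 1.0 and 0.05 standard deviations, `k=8`) is a hypothetical choice made for this example.

```python
import math
import random


def cov_eigenvalues(points):
    """Eigenvalues (largest first) of the 2x2 population covariance matrix."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in points) / n
    var_y = sum((y - mean_y) ** 2 for _, y in points) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in points) / n
    # Closed form for a symmetric 2x2 matrix [[var_x, cov_xy], [cov_xy, var_y]].
    center = (var_x + var_y) / 2
    radius = math.hypot((var_x - var_y) / 2, cov_xy)
    return center + radius, center - radius


def nearest(point, centers):
    """Index of the center closest to `point` in squared Euclidean distance."""
    return min(
        range(len(centers)),
        key=lambda j: (point[0] - centers[j][0]) ** 2 + (point[1] - centers[j][1]) ** 2,
    )


def kmeans_quantize(points, k, iters=25, seed=0):
    """Run Lloyd's algorithm and replace each point by its cluster mean."""
    centers = random.Random(seed).sample(points, k)
    for _ in range(iters):
        labels = [nearest(p, centers) for p in points]
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = (
                    sum(x for x, _ in members) / len(members),
                    sum(y for _, y in members) / len(members),
                )
    # Centers are exact means of the final labels, so do not reassign here.
    return [centers[label] for label in labels]


# Assumed toy data: one high-variance axis and one low-variance axis.
rng = random.Random(0)
data = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 0.05)) for _ in range(1000)]
quantized = kmeans_quantize(data, k=8)

data_eigs = cov_eigenvalues(data)
quant_eigs = cov_eigenvalues(quantized)
for name, d, q in zip(("largest", "smallest"), data_eigs, quant_eigs):
    print(f"{name} eigenvalue: data={d:.5f} quantized={q:.5f} retained={q / d:.1%}")
```

Because each point is replaced by the mean of its own cluster, the quantized covariance is the between-cluster part of the total covariance, so neither eigenvalue can grow. In this toy setting the clusters mostly partition the high-variance axis, which leaves the low-variance direction retaining a much smaller share of its original eigenvalue.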