Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Neighbour-Driven Gaussian Process Variational Autoencoders for Scalable Structured Latent Modelling

Authors: Xinxing Shi, Xiaoyu Jiang, Mauricio A. Álvarez

ICML 2025 | Venue PDF | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental Through extensive experiments on tasks including representation learning, data imputation, and conditional generation, we demonstrate that our approach outperforms other GPVAE variants in both predictive performance and computational efficiency. Empirical experiments demonstrate that our approach improves both predictive accuracy and training speed compared to existing GPVAE baselines.
Researcher Affiliation Academia 1Department of Computer Science, University of Manchester, Manchester, UK. Correspondence to: Xinxing Shi <EMAIL>, Xiaoyu Jiang <EMAIL>, Mauricio A. Alvarez <EMAIL>.
Pseudocode No The paper describes the inference methods using mathematical formulations and descriptive text, but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code Yes Our implementation is open-sourced at https://github.com/shixinxing/NNGPVAE-official.
Open Datasets Yes Our experiments start with latent representation learning on the synthetic moving ball data from Pearce (2020). Rotated MNIST consists of sequences of handwritten digit images from the MNIST dataset (LeCun et al., 1998)... The dataset has 50,000/10,000 training/test sequences, each containing 10 frames... created by Krishnan et al. (2015). The MuJoCo dataset collects physical simulation data from the DeepMind Control Suite (Rubanova et al., 2019). The first dataset, Jura (Goovaerts, 1997)... while the second dataset, SPE10 (Christie & Blunt, 2001).
Dataset Splits Yes Rotated MNIST... The dataset has 50,000/10,000 training/test sequences... This experiment involves 4,000 training and 1,000 test sequences... The MuJoCo dataset... The series set is then split into training, validation, and test subsets by 320/80/100.
Hardware Specification Yes The experiments are run on an NVIDIA A100-SXM4 or V100-SXM2 GPU of a high-performance cluster... Most training time is estimated on an NVIDIA RTX-4090 GPU, except for the missing pixel imputation task, which is tested on an RTX-2080-Ti due to software compatibility.
Software Dependencies No For fair comparisons, most scalable models (including our models, SVGPVAE, MGPVAE, SGPBAE, VAE, HI-VAE and GP models) are implemented in PyTorch (Paszke et al., 2019) and GPyTorch (Gardner et al., 2018). We use the modern similarity search package, Faiss (Johnson et al., 2019), for nearest neighbour searches. Specific version numbers for these software components are not provided in the text.
Experiment Setup Yes We summarise the experimental settings in Table 5. (Moving Ball)... The settings are listed in Table 7. (Corrupted Frame Imputation)... Additional setups are presented in Table 10. (MuJoCo Hopper Physics)... More details are in Table 13. (Jura)... The experimental settings are listed in Table 14. (SPE10). These tables include hyperparameters such as "Latent dimensionality", "Optimizer Adam, lr", "Training epochs", "Mini-batch size", "Trade-off parameter β".
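The paper relies on nearest-neighbour search (via Faiss) to select which neighbouring points condition each latent. As a minimal, self-contained sketch of that lookup, a brute-force k-nearest-neighbour search in plain NumPy (a stand-in for Faiss's `IndexFlatL2`; all array names, sizes, and `k` below are illustrative, not from the paper) looks like:

```python
import numpy as np

def knn_search(database, queries, k):
    """Brute-force k-nearest-neighbour search by squared L2 distance.

    A NumPy stand-in for a Faiss IndexFlatL2 search; shapes and names
    here are illustrative assumptions, not taken from the paper.
    """
    # (n_queries, n_database) matrix of squared Euclidean distances
    d2 = ((queries[:, None, :] - database[None, :, :]) ** 2).sum(-1)
    # indices of the k closest database points, sorted by distance
    idx = np.argsort(d2, axis=1)[:, :k]
    return idx, np.take_along_axis(d2, idx, axis=1)

rng = np.random.default_rng(0)
xb = rng.normal(size=(100, 8))   # database points
xq = rng.normal(size=(5, 8))     # query points
idx, dist = knn_search(xb, xq, k=3)
# idx and dist both have shape (5, 3): per query, the 3 nearest
# database indices and their squared distances in ascending order
```

Faiss performs the same search far more efficiently at scale; the brute-force version only serves to make the operation concrete.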
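The MuJoCo split reported above (320/80/100 series) can be reproduced mechanically. The sketch below uses a random permutation; the actual shuffling scheme and seed are not stated in the text, so both are assumptions here:

```python
import numpy as np

# 320 + 80 + 100 series total, per the reported split sizes
n_series = 500
rng = np.random.default_rng(0)   # seed is an assumption, not from the paper

perm = rng.permutation(n_series)
train, val, test = perm[:320], perm[320:400], perm[400:]
# Disjoint index sets of sizes 320, 80, and 100
```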