Diversity vs. Recognizability: Human-like generalization in one-shot generative models

Authors: Victor Boutin, Lakshya Singhal, Xavier Thomas, Thomas Serre

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Here, we propose a new framework to evaluate one-shot generative models along two axes: sample recognizability vs. diversity (i.e., intra-class variability). Using this framework, we perform a systematic evaluation of representative one-shot generative models on the Omniglot handwritten dataset. We first show that GAN-like and VAE-like models fall on opposite ends of the diversity-recognizability space. Extensive analyses of the effect of key model parameters further revealed that spatial attention and context integration have a linear contribution to the diversity-recognizability trade-off.
Researcher Affiliation | Academia | 1 Artificial and Natural Intelligence Toulouse Institute, Université de Toulouse, France; 2 Carney Institute for Brain Science, Dpt. of Cognitive Linguistic & Psychological Sciences, Brown University, Providence, RI 02912
Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks.
Open Source Code | Yes | Our code could be found at https://github.com/serre-lab/diversity_vs_recognizability.
Open Datasets | Yes | In this article, we use the Omniglot dataset [32] with a weak generalization split [42].
Dataset Splits | No | The paper describes training and testing splits, but it does not explicitly provide details about a separate validation dataset split (e.g., percentages, sample counts, or how it was created).
Hardware Specification | No | The paper states that hardware specifications are given in the Supplementary Information (section S18), but the main paper does not contain specific details such as GPU models or CPU types.
Software Dependencies | No | The paper mentions various models and networks (e.g., VAE, GAN, Inception v3 Net), but it does not provide specific version numbers for any software libraries or dependencies used (e.g., PyTorch, TensorFlow, Python).
Experiment Setup | Yes | For all algorithms listed in section 2.2 we have explored different hyper-parameters (see section 4.2 for more details). ... We evaluate the effect of the context on the position of the VAE-NS models on the diversity-recognizability space by varying the number of samples used to compute the context statistics during the training phase (from 2 to 20 samples). ... We have varied the number of attentional steps from 20 to 90. ... One can operate such a modulation by changing the β coefficient in the ELBO loss function [24].
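
The β coefficient mentioned in the Experiment Setup row scales the KL term of the standard VAE evidence lower bound, as in β-VAE (the paper's reference [24]). The sketch below is a minimal, hedged illustration of that modulation; the function name and the Bernoulli-style reconstruction term are illustrative assumptions, not the paper's actual implementation.

```python
import math

def beta_elbo(recon_nll, mu, logvar, beta=1.0):
    """Negative beta-weighted ELBO for a diagonal-Gaussian posterior.

    recon_nll : reconstruction negative log-likelihood (a scalar)
    mu, logvar: per-dimension posterior mean and log-variance of q(z|x)
    beta      : weight on the KL term; beta > 1 regularizes the latent
                space more strongly, trading reconstruction fidelity
                for disentanglement (the modulation described above).
    """
    # Closed-form KL( N(mu, sigma^2) || N(0, I) ), summed over latent dims
    kl = -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                    for m, lv in zip(mu, logvar))
    return recon_nll + beta * kl
```

For example, with `mu = [1.0]` and `logvar = [0.0]` the KL term equals 0.5, so raising β from 1 to 4 increases the loss contributed by the KL penalty fourfold while leaving the reconstruction term untouched.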