Latent Representation Matters: Human-like Sketches in One-shot Drawing Tasks
Authors: Victor Boutin, Rishav Mukherji, Aditya Agrawal, Sabine Muzellec, Thomas Fel, Thomas Serre, Rufin VanRullen
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Here, we study how different inductive biases shape the latent space of Latent Diffusion Models (LDMs). Along with standard LDM regularizers (KL and vector quantization), we explore supervised regularizations (including classification and prototype-based representation) and contrastive inductive biases (using SimCLR and redundancy reduction objectives). We demonstrate that LDMs with redundancy reduction and prototype-based regularizations produce near-human-like drawings (regarding both samples' recognizability and originality), better mimicking human perception (as evaluated psychophysically). Overall, our results suggest that the gap between humans and machines in one-shot drawings is almost closed. (A hedged sketch of a Barlow-style redundancy-reduction penalty in this spirit appears after the table.) |
| Researcher Affiliation | Academia | (1) Artificial and Natural Intelligence Toulouse Institute, Université de Toulouse, Toulouse, France; (2) Centre de Recherche Cerveau & Cognition, CNRS, Université de Toulouse, France; (3) Carney Institute for Brain Science, Brown University |
| Pseudocode | Yes | Algorithm 1: VQVAE pseudo-code; Algorithm 2: Prototype-based regularizer pseudo-code; Algorithm 3: SimCLR regularizer pseudo-code; Algorithm 4: Barlow regularizer pseudo-code. (A hedged PyTorch sketch of a prototype-based regularizer appears after the table.) |
| Open Source Code | Yes | The code to train all described models is available at http://anonymous.4open.science/r/Latent Matters-526B. |
| Open Datasets | Yes | As done in previous work [31, 30, 11], we use the Omniglot [11] and the QuickDraw-FS [30] datasets to compare humans and machines on the one-shot drawing task...The databases we use are already in open access. |
| Dataset Splits | No | The paper specifies training and test splits for the datasets but does not explicitly report a dedicated validation split, either as a percentage or as absolute counts. |
| Hardware Specification | Yes | All the experiments of this paper have been performed using Quadro-RTX600 GPUs with 16 GB memory. |
| Software Dependencies | No | The paper mentions using the 'Adam optimizer [69]' but does not provide specific version numbers for software dependencies or libraries such as Python, PyTorch, TensorFlow, or other key components required for replication. |
| Experiment Setup | Yes | We train the model using the Mean Squared Error loss with a batch size of 128 for the reconstruction, along with different regularizations to study their effects. For both datasets, we use the Adam optimizer [69] with a weight decay of 10^-5 and a learning rate of 10^-4. The RAEs were trained for 200 epochs on the QuickDraw dataset and for 300 epochs on the Omniglot dataset. Note that when trained on the Omniglot dataset, we use a learning rate scheduler in which the learning rate is divided by 4 every 70 epochs. (A hedged training-loop sketch reflecting these hyperparameters appears after the table.) |
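
The abstract row above points to a redundancy-reduction inductive bias, and the pseudocode row lists a "Barlow regularizer". The authors' implementation is not reproduced in this report; below is a minimal PyTorch sketch of a Barlow Twins-style redundancy-reduction penalty on latent codes, assuming two latent views `z1` and `z2` of shape `(batch, dim)`. The function name and the `lambd` default are illustrative, not the paper's values.

```python
import torch

def barlow_redundancy_loss(z1: torch.Tensor, z2: torch.Tensor,
                           lambd: float = 5e-3) -> torch.Tensor:
    """Barlow Twins-style redundancy-reduction penalty on two latent views.

    z1, z2: (batch, dim) latent codes of two augmented views of the same sketch.
    lambd:  illustrative weight on the off-diagonal (redundancy) terms.
    """
    batch = z1.size(0)
    # Standardize each latent dimension across the batch.
    z1 = (z1 - z1.mean(0)) / (z1.std(0) + 1e-6)
    z2 = (z2 - z2.mean(0)) / (z2.std(0) + 1e-6)
    # Cross-correlation matrix between the two views: (dim, dim).
    c = (z1.T @ z2) / batch
    on_diag = (torch.diagonal(c) - 1.0).pow(2).sum()              # invariance term
    off_diag = (c - torch.diag(torch.diagonal(c))).pow(2).sum()   # redundancy-reduction term
    return on_diag + lambd * off_diag
```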
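Similarly, the paper's Algorithm 2 describes a prototype-based regularizer whose details are not quoted in this report. The sketch below is one plausible reading, assuming one learnable prototype per class and a distance-based cross-entropy that pulls latents toward their class prototype; the class name, the squared-distance logits, and the loss form are assumptions, not the authors' exact formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PrototypeRegularizer(nn.Module):
    """Hypothetical prototype-based latent regularizer (not the paper's exact Algorithm 2).

    Keeps one learnable prototype per class, scores each latent by its negative
    squared distance to every prototype, and applies a cross-entropy on these
    scores so latents cluster around their class prototype.
    """

    def __init__(self, num_classes: int, latent_dim: int):
        super().__init__()
        self.prototypes = nn.Parameter(torch.randn(num_classes, latent_dim))

    def forward(self, z: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
        # z: (batch, latent_dim), labels: (batch,) integer class indices.
        dists = torch.cdist(z, self.prototypes)   # (batch, num_classes)
        logits = -dists.pow(2)                    # closer prototype -> higher score
        return F.cross_entropy(logits, labels)
```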
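Finally, the experiment-setup row quotes the optimizer and schedule. The loop below is a hedged sketch wiring those numbers together (Adam, learning rate 10^-4, weight decay 10^-5, batch size 128, MSE reconstruction, and the Omniglot schedule dividing the learning rate by 4 every 70 epochs); the stand-in encoder/decoder, the 50x50 input size, and the regularizer weighting are assumptions, not the paper's RAE architecture.

```python
import torch
from torch import nn, optim

latent_dim = 128
# Stand-in encoder/decoder over assumed 50x50 grayscale inputs;
# the paper's RAE backbone is not reproduced here.
encoder = nn.Sequential(nn.Flatten(), nn.Linear(50 * 50, latent_dim))
decoder = nn.Sequential(nn.Linear(latent_dim, 50 * 50))

params = list(encoder.parameters()) + list(decoder.parameters())
# Learning rate and weight decay as quoted from the paper.
optimizer = optim.Adam(params, lr=1e-4, weight_decay=1e-5)
# Omniglot schedule: divide the learning rate by 4 every 70 epochs (gamma = 0.25).
scheduler = optim.lr_scheduler.StepLR(optimizer, step_size=70, gamma=0.25)
recon_loss = nn.MSELoss()

def train_epoch(loader, regularizer=None, reg_weight=1.0):
    """One epoch over `loader` yielding (images, labels); batch size 128 in the paper."""
    for x, labels in loader:
        optimizer.zero_grad()
        x_flat = x.view(x.size(0), -1)
        z = encoder(x_flat)
        recon = decoder(z)
        loss = recon_loss(recon, x_flat)
        if regularizer is not None:   # e.g. the prototype penalty sketched above
            loss = loss + reg_weight * regularizer(z, labels)
        loss.backward()
        optimizer.step()
    scheduler.step()
```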