A Neural Representation of Sketch Drawings
Authors: David Ha, Douglas Eck
ICLR 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct several experiments with sketch-rnn for both conditional and unconditional vector image generation. We train sketch-rnn on various Quick Draw classes using various settings for w_KL and record the breakdown of losses. To experiment with a diverse set of classes with varying complexities, we select the cat, pig, face, firetruck, garden, owl, mosquito and yoga classes. We also experiment on multi-class datasets by concatenating different classes together to form (cat, pig) and (crab, face, pig, rabbit). The results for test set evaluation on various datasets are displayed in Table 1. (An illustrative experiment-grid sketch appears after this table.) |
| Researcher Affiliation | Industry | David Ha Google Brain hadavid@google.com Douglas Eck Google Brain deck@google.com |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | We make available a dataset of 50 million hand drawn vector images to encourage further development of generative modelling for vector images, and also release an implementation of our model as an open source project. The code and dataset is available at https://magenta.tensorflow.org/sketch_rnn. |
| Open Datasets | Yes | We constructed Quick Draw, a dataset of 50 million vector drawings obtained from Quick, Draw! (Jongejan et al., 2016), an online game where the players are asked to draw objects belonging to a particular object class in less than 20 seconds. Quick Draw consists of hundreds of classes of common objects. Each class of Quick Draw is a dataset of 70K training samples, in addition to 2.5K validation and 2.5K test samples. ... The code and dataset is available at https://magenta.tensorflow.org/sketch_rnn. (A loading sketch that checks these split sizes appears after this table.) |
| Dataset Splits | Yes | Each class of Quick Draw is a dataset of 70K training samples, in addition to 2.5K validation and 2.5K test samples. |
| Hardware Specification | No | The paper describes model configurations and training parameters (e.g., RNN node counts, latent vector dimensions, batch sizes, learning rates) but does not specify any hardware details such as GPU models, CPU types, or memory used for running the experiments. |
| Software Dependencies | No | The paper mentions using Adam optimizer and Hyper LSTM cells, but it does not specify version numbers for these or other general software components like Python, TensorFlow, or CUDA, which would be necessary for reproducible environment setup. |
| Experiment Setup | Yes | Our encoder and decoder RNNs consist of 512 and 2048 nodes respectively. In our model, we use M = 20 mixture components for the decoder RNN. The latent vector z has N_z = 128 dimensions. We apply Layer Normalization (Ba et al., 2016) to our model, and during training apply recurrent dropout [9] with a keep probability of 90%. We train the model with batch sizes of 100 samples, using Adam (Kingma & Ba, 2015) with a learning rate of 0.0001 and gradient clipping of 1.0. All models are trained with KL_min = 0.20, R = 0.99999. During training, we perform simple data augmentation by multiplying the offset columns (Δx, Δy) by two IID random factors chosen uniformly between 0.90 and 1.10. Unless mentioned otherwise, all experiments are conducted with w_KL = 1.00. (A hedged configuration sketch appears after this table.) |
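
The Research Type row above describes training sketch-rnn on single classes and on concatenated multi-class datasets under several w_KL settings. The sketch below is a minimal illustration of how such an experiment grid could be assembled from the released per-class `.npz` files; the file names, the specific w_KL values, and the training placeholder are assumptions for illustration, not the authors' code.

```python
# Hypothetical sketch of the experiment grid described in the "Research Type" row:
# single-class and concatenated multi-class Quick Draw datasets, each paired with
# several w_KL settings. File names and w_KL values are illustrative only.
import numpy as np

def load_split(path, split="train"):
    # The released per-class files (e.g. cat.npz) are assumed to hold
    # 'train' / 'valid' / 'test' arrays of stroke sequences.
    return np.load(path, allow_pickle=True, encoding="latin1")[split]

def build_dataset(paths, split="train"):
    """Concatenate one or more classes, e.g. ('cat.npz', 'pig.npz') -> (cat, pig)."""
    return np.concatenate([load_split(p, split) for p in paths])

EXPERIMENTS = {
    "cat": ["cat.npz"],
    "(cat, pig)": ["cat.npz", "pig.npz"],
    "(crab, face, pig, rabbit)": ["crab.npz", "face.npz", "pig.npz", "rabbit.npz"],
}
WKL_SETTINGS = [0.25, 0.50, 1.00]  # assumed example values

for name, files in EXPERIMENTS.items():
    train_set = build_dataset(files, "train")
    for w_kl in WKL_SETTINGS:
        # A real run would train sketch-rnn on `train_set` with this KL weight
        # and record the loss breakdown, as reported in Table 1 of the paper.
        print(f"{name}: {len(train_set)} training samples, w_KL={w_kl}")
```

The Open Datasets and Dataset Splits rows quote a per-class split of 70K training, 2.5K validation and 2.5K test samples. The following is a minimal sketch for checking those counts locally, assuming a downloaded per-class file (the path is hypothetical) whose 'train', 'valid' and 'test' arrays hold stroke sequences in (Δx, Δy, pen-lifted) format.

```python
# Minimal check of the per-class split sizes quoted above (70K / 2.5K / 2.5K).
# Assumes a locally downloaded per-class file such as cat.npz with
# 'train', 'valid' and 'test' arrays of stroke sequences.
import numpy as np

data = np.load("cat.npz", allow_pickle=True, encoding="latin1")  # hypothetical path
for split in ("train", "valid", "test"):
    print(split, len(data[split]))
# Expected per the paper: train 70000, valid 2500, test 2500.
```

The Experiment Setup row lists the training hyperparameters verbatim. The sketch below collects those numbers into a configuration dictionary and shows one way the reported data augmentation (scaling the offset columns by two IID uniform factors in [0.90, 1.10]) could be implemented; the key names, the Keras optimizer wiring, and the clip-by-norm reading of "gradient clipping of 1.0" are assumptions, not the authors' implementation.

```python
# Training configuration from the "Experiment Setup" row, collected into a dict.
# Key names are illustrative; the numeric values are taken from the paper.
import numpy as np
import tensorflow as tf

HPARAMS = {
    "enc_rnn_size": 512,        # encoder RNN nodes
    "dec_rnn_size": 2048,       # decoder RNN nodes
    "num_mixture": 20,          # M mixture components in the decoder
    "z_size": 128,              # latent vector dimensionality N_z
    "batch_size": 100,
    "learning_rate": 1e-4,
    "grad_clip": 1.0,           # interpreted here as clip-by-norm (assumption)
    "recurrent_dropout_keep": 0.90,
    "kl_weight": 1.00,          # w_KL
    "kl_tolerance": 0.20,       # KL_min
    "kl_decay_rate": 0.99999,   # R
}

# Adam with the quoted learning rate and gradient clipping.
optimizer = tf.keras.optimizers.Adam(
    learning_rate=HPARAMS["learning_rate"], clipnorm=HPARAMS["grad_clip"])

def augment_strokes(strokes, low=0.90, high=1.10):
    """Scale the (dx, dy) offset columns by two IID uniform random factors,
    matching the simple data augmentation described in the paper."""
    out = np.array(strokes, dtype=np.float32)
    out[:, 0] *= np.random.uniform(low, high)
    out[:, 1] *= np.random.uniform(low, high)
    return out
```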
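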
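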