ChiroDiff: Modelling chirographic data with Diffusion Models

Authors: Ayan Das, Yongxin Yang, Timothy Hospedales, Tao Xiang, Yi-Zhe Song

ICLR 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We perform quantitative and qualitative evaluation of our framework on relevant datasets and found it to be better or on par with competing approaches."
Researcher Affiliation | Collaboration | Ayan Das (1,2), Yongxin Yang (1,3), Timothy Hospedales (1,4,5), Tao Xiang (1,2) & Yi-Zhe Song (1,2). Affiliations: 1 SketchX, CVSSP, University of Surrey; 2 iFlyTek-Surrey Joint Research Centre on AI; 3 Queen Mary University of London; 4 University of Edinburgh; 5 Samsung AI Centre, Cambridge
Pseudocode | No | The paper does not contain explicitly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | "Please refer to the project page for full source code." Project page: https://ayandas.me/chirodiff
Open Datasets | Yes | Vector MNIST or VMNIST (Das et al., 2022); KanjiVG, a vector dataset containing Kanji characters. "We use a preprocessed version of the dataset which converted the original SVGs into polyline sequences." (Original KanjiVG: kanjivg.tagaini.net; pre-processed KanjiVG: github.com/hardmaru/sketch-rnn-datasets/tree/master/kanji. A sketch of such SVG-to-polyline sampling appears after this table.)
Dataset Splits | Yes | "We use 80-10-10 splits for all our experimentation." (A sketch of such a split appears after this table.)
Hardware Specification | No | The paper does not explicitly describe the specific hardware (e.g., GPU models, CPU types, memory) used to run the experiments.
Software Dependencies | No | The paper mentions using the ‘AdamW optimizer’ but does not specify other key software components (e.g., deep learning framework, programming language) with version numbers.
Experiment Setup | Yes | "ChiroDiff's forward process, just like traditional DDPMs, uses a linear noising schedule of β_min = 10⁻⁴ · (1000/T), β_max = 2·10⁻² · (1000/T) ... we choose a standard value of T = 1000. ... We use a 2-layer GRU with D = 48 hidden units for VMNIST and a 3-layer GRU for QuickDraw (D = 128) and KanjiVG (D = 96). We trained all of our models by minimizing Eq. 3 using the AdamW optimizer ... and step-wise LR scheduling of γ_e = 0.9997 · γ_{e−1} at every epoch e, where γ_0 = 6·10⁻³. ... We found σ_t² = 0.8 · β_t to work well in the majority of cases ..."
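The quoted hyperparameters are enough to sketch the training configuration. Below is a minimal, hedged instantiation in PyTorch; the paper does not specify a framework, and all variable names here are ours, not the authors'.

```python
import torch

# Reported diffusion hyperparameters (schedule endpoints scale with 1000/T).
T = 1000                          # number of diffusion steps
beta_min = 1e-4 * (1000 / T)      # = 1e-4 when T = 1000
beta_max = 2e-2 * (1000 / T)      # = 2e-2 when T = 1000
betas = torch.linspace(beta_min, beta_max, T)  # linear noising schedule
sigma_sq = 0.8 * betas            # reverse-process variance reported to work well

# Reported encoder sizes: 2-layer GRU (D = 48) for VMNIST; 3-layer GRU for
# QuickDraw (D = 128) and KanjiVG (D = 96). Input size 3 is an assumption
# (x, y, pen state), not stated in the quoted text.
encoder = torch.nn.GRU(input_size=3, hidden_size=128, num_layers=3)

# AdamW with per-epoch exponential LR decay: gamma_e = 0.9997 * gamma_{e-1}.
optimizer = torch.optim.AdamW(encoder.parameters(), lr=6e-3)  # gamma_0 = 6e-3
scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.9997)
# Call scheduler.step() once per epoch to realize the step-wise schedule.
```

For the 80-10-10 splits (row above), the paper states the ratio but not the procedure or seed; one plausible shuffle-and-partition sketch:

```python
import random

def split_80_10_10(samples, seed=0):
    """Illustrative 80/10/10 train/val/test partition; procedure and seed are assumptions."""
    idx = list(range(len(samples)))
    random.Random(seed).shuffle(idx)
    n_train, n_val = int(0.8 * len(samples)), int(0.1 * len(samples))
    train = [samples[i] for i in idx[:n_train]]
    val = [samples[i] for i in idx[n_train:n_train + n_val]]
    test = [samples[i] for i in idx[n_train + n_val:]]
    return train, val, test
```

Finally, the KanjiVG preprocessing "converted the original SVGs into polyline sequences". The actual preprocessing script is hardmaru's and is not reproduced here; as an assumption-laden illustration, uniform sampling along each SVG path with the svgpathtools package would look like:

```python
from svgpathtools import svg2paths

def svg_to_polylines(svg_file, points_per_path=50):
    """Sample each SVG path into a fixed-length (x, y) polyline (illustrative only)."""
    paths, _ = svg2paths(svg_file)
    polylines = []
    for path in paths:
        ts = [i / (points_per_path - 1) for i in range(points_per_path)]
        # Path.point(t) returns a complex number; split into (x, y).
        polylines.append([(path.point(t).real, path.point(t).imag) for t in ts])
    return polylines
```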