Learning to Generate with Memory
Authors: Chongxuan Li, Jun Zhu, Bo Zhang
ICML 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments on several datasets demonstrate that memory can significantly boost the performance of DGMs on various tasks, including density estimation, image generation, and missing value imputation, and DGMs with memory can achieve state-of-the-art quantitative results. |
| Researcher Affiliation | Academia | Chongxuan Li (LICX14@MAILS.TSINGHUA.EDU.CN), Jun Zhu (DCSZJ@MAIL.TSINGHUA.EDU.CN), Bo Zhang (DCSZB@MAIL.TSINGHUA.EDU.CN); Dept. of Comp. Sci. & Tech., State Key Lab of Intell. Tech. & Sys., TNList Lab, Center for Bio-Inspired Computing Research, Tsinghua University, Beijing, 100084, China |
| Pseudocode | No | No structured pseudocode or algorithm blocks were found in the paper. |
| Open Source Code | Yes | Source code at https://github.com/zhenxuan00/MEM_DGM |
| Open Datasets | Yes | The MNIST dataset (Lecun et al., 1998) consists of 50,000 training, 10,000 validation and 10,000 testing images of handwritten digits, each of 28×28 pixels. The OCR-letters dataset (Bache & Lichman, 2013) consists of 32,152 training, 10,000 validation and 10,000 testing letter images of size 16×8 pixels. |
| Dataset Splits | Yes | The MNIST dataset (Lecun et al., 1998) consists of 50,000 training, 10,000 validation and 10,000 testing images of handwritten digits, each of 28×28 pixels. The OCR-letters dataset (Bache & Lichman, 2013) consists of 32,152 training, 10,000 validation and 10,000 testing letter images of size 16×8 pixels. |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory amounts) used for running experiments were found in the paper. |
| Software Dependencies | No | The paper mentions 'Theano (Bastien et al., 2012)' and 'ADAM (Kingma & Ba, 2015)' but does not provide version numbers for these software dependencies, which are required for reproducibility. |
| Experiment Setup | Yes | We use ADAM (Kingma & Ba, 2015) in all experiments with parameters β1 = 0.9, β2 = 0.999 (decay rates of moving averages) and ϵ = 10⁻⁴ (a constant that prevents overflow). As a default, the global learning rate is fixed as 10⁻³ for 1,000 epochs and annealed by a factor 0.998 for 2,000 epochs with minibatch size 100. |
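
The experiment-setup entry above fully specifies the optimizer configuration. Below is a minimal NumPy sketch of that configuration (Adam with β1 = 0.9, β2 = 0.999, ϵ = 10⁻⁴, learning rate 10⁻³ held fixed for 1,000 epochs and then annealed by 0.998 per epoch for 2,000 epochs, minibatch size 100). This is an illustrative reconstruction, not the paper's Theano-based MEM_DGM implementation; function names such as `adam_step` and `learning_rate` are ours.

```python
import numpy as np

# Hyperparameters quoted from the paper's experiment setup.
BETA1, BETA2, EPS = 0.9, 0.999, 1e-4      # Adam decay rates and stabilizing constant
BASE_LR = 1e-3                             # global learning rate
FIXED_EPOCHS, ANNEAL_EPOCHS = 1000, 2000   # fixed phase, then annealed phase
ANNEAL_FACTOR = 0.998                      # per-epoch decay during the annealed phase
BATCH_SIZE = 100                           # minibatch size

def learning_rate(epoch):
    """Fixed for the first 1,000 epochs, then multiplied by 0.998 each epoch."""
    if epoch < FIXED_EPOCHS:
        return BASE_LR
    return BASE_LR * ANNEAL_FACTOR ** (epoch - FIXED_EPOCHS)

def adam_step(param, grad, m, v, t, lr):
    """One Adam update (Kingma & Ba, 2015) with the hyperparameters above."""
    m = BETA1 * m + (1 - BETA1) * grad
    v = BETA2 * v + (1 - BETA2) * grad ** 2
    m_hat = m / (1 - BETA1 ** t)           # bias-corrected first moment
    v_hat = v / (1 - BETA2 ** t)           # bias-corrected second moment
    param = param - lr * m_hat / (np.sqrt(v_hat) + EPS)
    return param, m, v

# Toy usage: minimize theta**2 for a few steps with the scheduled learning rate.
theta, m, v = np.array(1.0), 0.0, 0.0
for t in range(1, 6):
    grad = 2 * theta                       # gradient of theta**2, for illustration
    theta, m, v = adam_step(theta, grad, m, v, t, learning_rate(t - 1))
```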