Estimating High Order Gradients of the Data Distribution by Denoising

Authors: Chenlin Meng, Yang Song, Wenzhe Li, Stefano Ermon

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We demonstrate empirically that models trained with the proposed method can approximate second order derivatives more efficiently and accurately than via automatic differentiation. Our experiments show that models learned with the proposed objective can approximate second order scores more accurately than applying automatic differentiation to lower order score models. Our approach is also more computationally efficient for high dimensional data, achieving up to 500× speedups for second order score estimation on MNIST. (A hedged sketch of this autodiff baseline follows the table.)
Researcher Affiliation | Academia | Chenlin Meng (Stanford University, chenlin@cs.stanford.edu); Yang Song (Stanford University, yangsong@cs.stanford.edu); Wenzhe Li (Tsinghua University, lwz21@mails.tsinghua.edu.cn); Stefano Ermon (Stanford University, ermon@cs.stanford.edu)
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide concrete access to source code, such as a specific repository link or an explicit code release statement.
Open Datasets | Yes | Our approach is also more computationally efficient for high dimensional data, achieving up to 500× speedups for second order score estimation on MNIST. We visualize the diagonal of the estimated Cov[x | x̃] for MNIST and CIFAR-10 [10] in Fig. 3.
Dataset Splits | No | The paper mentions 'test samples' and the 'MNIST test set' but does not provide specific dataset split information (exact percentages, sample counts, or detailed splitting methodology) for training, validation, and testing needed to reproduce the data partitioning.
Hardware Specification | Yes | We report the wall-clock time, averaged over 7 runs, used for estimating second order scores during test time on a TITAN Xp GPU in Table 2.
Software Dependencies | No | The paper does not provide specific ancillary software details, such as library or solver names with version numbers.
Experiment Setup | Yes | We parameterize s1 and s2 with the same model architecture and use a batch size of 10 for both settings. We search for the optimal step size for each method and observe that Ozaki sampling can use a larger step size and converge faster than Langevin dynamics (see Fig. 5). $\mathcal{L}_{\text{joint}}(\theta) = \mathcal{L}_{\text{D2SM}}(\theta) + \gamma\,\mathcal{L}_{\text{DSM}}(\theta)$ (Eq. 14), where $\mathcal{L}_{\text{DSM}}(\theta)$ is defined in Eq. (3) and $\gamma \in \mathbb{R}_{\geq 0}$ is a tunable coefficient.
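
For context on the objective quoted in the Experiment Setup row, the combination in Eq. (14) can be sketched as below. This is a minimal sketch and not the authors' code: the DSM term is the standard denoising score matching loss, while the exact form of the second order (D2SM) term is defined in the paper and is left here as a user-supplied callable (`d2sm_loss` is a hypothetical name).

```python
# Minimal sketch (not the authors' code) of the joint objective in Eq. (14):
#     L_joint(theta) = L_D2SM(theta) + gamma * L_DSM(theta)
# The DSM term below is standard denoising score matching; `d2sm_loss` is a
# hypothetical placeholder for the paper's second order objective.
import torch


def dsm_loss(s1, x, sigma):
    """Standard denoising score matching loss for a first-order score model s1."""
    noise = torch.randn_like(x)
    x_tilde = x + sigma * noise      # perturb data with Gaussian noise
    target = -noise / sigma          # score of the perturbation kernel at x_tilde
    return ((s1(x_tilde) - target) ** 2).sum(dim=-1).mean()


def joint_loss(s1, s2, x, sigma, gamma, d2sm_loss):
    """L_joint = L_D2SM + gamma * L_DSM, with the D2SM term supplied by the caller."""
    return d2sm_loss(s1, s2, x, sigma) + gamma * dsm_loss(s1, x, sigma)
```

The tunable coefficient gamma simply trades off how strongly the first-order DSM term regularizes training of the second order model.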
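The automatic differentiation baseline referenced in the Research Type row (obtaining second order scores by differentiating a first-order score model) can also be sketched. This is illustrative only: the network below is a toy stand-in, not the paper's architecture, and the function name is hypothetical.

```python
# Hedged sketch of the autodiff baseline: obtain second order scores by
# differentiating a learned first-order score model s1(x) ~ grad_x log p(x).
# For D-dimensional inputs this needs on the order of D backward passes,
# which is the cost the directly trained second order estimator avoids.
import torch


def second_order_score_via_autodiff(s1, x):
    """Jacobian of s1 at x; for s1 ~ grad log p, this approximates the Hessian of log p."""
    return torch.autograd.functional.jacobian(s1, x)


# Toy usage with a stand-in score network (hypothetical architecture).
D = 8
s1 = torch.nn.Sequential(torch.nn.Linear(D, 64), torch.nn.Tanh(), torch.nn.Linear(64, D))
x = torch.randn(D)
hessian_log_p = second_order_score_via_autodiff(s1, x)  # shape [D, D]
```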