Stochastic Gradient Geodesic MCMC Methods

Authors: Chang Liu, Jun Zhu, Yang Song

NeurIPS 2016

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Synthetic experiments show the validity of the method, and its application to the challenging inference task for spherical topic models indicates practical usability and efficiency. The paper states: "We present empirical results on both synthetic and real datasets to prove the accuracy and efficiency of our methods."
Researcher Affiliation | Academia | Dept. of Comp. Sci. & Tech., TNList Lab; Center for Bio-Inspired Computing Research; State Key Lab for Intell. Tech. & Systems, Tsinghua University, Beijing, China; Dept. of Physics, Tsinghua University, Beijing, China. Contact: {chang-li14@mails, dcszj@}.tsinghua.edu.cn; songyang@stanford.edu. JZ is the corresponding author; YS is with the Department of Computer Science, Stanford University, CA.
Pseudocode | Yes | Algorithms for SGGMC and gSGNHT are listed in Appendix E.
Open Source Code | Yes | All code and data are available at http://ml.cs.tsinghua.edu.cn/~changliu/sggmcmc-sam/.
Open Datasets | Yes | SGGMC/gSGNHT are applied to the challenging task of posterior inference in the Spherical Admixture Model (SAM) [24]. The small dataset is the 20News-different dataset used by [24], consisting of 3 categories from the 20Newsgroups dataset; the large dataset is a subset of the Wikipedia dataset with 150K training and 1K test documents, used to challenge the scalability of all the methods.
Dataset Splits | No | The paper specifies training and test document counts for the datasets but does not explicitly mention a validation split.
Hardware Specification | No | The paper does not provide specific hardware details such as GPU or CPU models, memory, or cloud instance types used for experiments. It only mentions that "all sampling methods are implemented in C++ and fairly parallelized by OpenMP."
Software Dependencies | No | The paper mentions implementations in C++ with OpenMP, and that VI/StoVI are run using the MATLAB code of [24]. However, it does not specify version numbers for any library or software component.
Experiment Setup | Yes | In the synthetic experiment, the stochastic gradient is produced by corrupting the true gradient with N(0, 1000I) noise, whose variance is used as V(x) in Eqn. (8) for sampling; SGGMC uses the empirical Fisher information in the manner of [2] for V(x) in Eqn. (8), with a batch size of 10. For SAM, hyper-parameters are fixed during training and set identically for all methods, and V(x) in Eqn. (8) is taken to be zero for SGGMC/gSGNHT. The experiments use 20 topics with batch size 50 and 50 topics with batch size 100 (for the small and large datasets, respectively).
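For context on the setup described above, the sketch below shows the geodesic leapfrog update on the unit sphere that geodesic samplers such as SGGMC build on: a half momentum step with a projected (stochastic) gradient, an exact great-circle geodesic flow, and another half momentum step. The function names, the `stoch_grad` callback, and the additive noise term are illustrative assumptions; the paper's full SGGMC update also involves the friction and noise-variance terms of its Eqn. (8), which are omitted here.

```python
import numpy as np

def project_to_tangent(q, g):
    # Project an ambient-space vector onto the tangent space of the
    # unit sphere at q:  (I - q q^T) g.
    return g - q * (q @ g)

def geodesic_flow(q, v, t):
    # Exact geodesic flow on the unit sphere: travel along a great
    # circle with speed ||v|| for time t. Assumes ||q|| = 1, q @ v = 0.
    a = np.linalg.norm(v)
    if a < 1e-12:
        return q, v
    q_new = q * np.cos(a * t) + (v / a) * np.sin(a * t)
    v_new = -q * a * np.sin(a * t) + v * np.cos(a * t)
    return q_new, v_new

def sggmc_style_step(q, v, stoch_grad, eps, noise_scale=0.0, rng=None):
    # One leapfrog-style update with a stochastic gradient of the
    # log-posterior (hypothetical callback, not the paper's exact scheme).
    rng = rng or np.random.default_rng(0)
    g = stoch_grad(q)
    if noise_scale > 0:
        g = g + noise_scale * rng.standard_normal(q.shape)
    v = v + 0.5 * eps * project_to_tangent(q, g)
    q, v = geodesic_flow(q, v, eps)
    g = stoch_grad(q)
    v = v + 0.5 * eps * project_to_tangent(q, g)
    v = project_to_tangent(q, v)  # guard tangency against round-off
    return q, v
```

Because the geodesic flow is exact, the iterate stays on the sphere regardless of the step size, which is the property that lets such samplers handle manifold constraints like those of the spherical topic model.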