Mode Estimation for High Dimensional Discrete Tree Graphical Models

Authors: Chao Chen, Han Liu, Dimitris Metaxas, Tianqi Zhao

NeurIPS 2014 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental Experimental results demonstrate the accuracy and efficiency of our algorithm. More theoretical guarantees for our algorithm can be found in [7]. To validate our method, we first show the scalability and accuracy of our algorithm on synthetic data. Furthermore, we demonstrate using biological data how modes can be used as a novel analysis tool.
Researcher Affiliation Academia Chao Chen, Department of Computer Science, Rutgers, The State University of New Jersey, Piscataway, NJ 08854-8019, chao.chen.cchen@gmail.com; Han Liu, Department of Operations Research and Financial Engineering, Princeton University, Princeton, NJ 08544, hanliu@princeton.edu; Dimitris N. Metaxas, Department of Computer Science, Rutgers, The State University of New Jersey, Piscataway, NJ 08854-8019, dnm@cs.rutgers.edu; Tianqi Zhao, Department of Operations Research and Financial Engineering, Princeton University, Princeton, NJ 08544, tianqi@princeton.edu
Pseudocode Yes Procedure 1 Compute-M-Modes
Input: A tree G, a potential function f, and a scale δ
Output: The M modes of the lowest potential
1: Construct geodesic balls B = {B_r(c) | c ∈ V}, where r = δ/2 + 1
2: for all B ∈ B do
3:   M^δ_B = the set of local modes of B
4: Construct a junction tree (Figure 2); the label set of each supernode is its local modes
5: Compute the M lowest-potential labelings of the junction tree, using Nilsson's algorithm
(An illustrative Python sketch related to this procedure is given after the table.)
Open Source Code No The paper does not provide any link to source code, nor does it state that code for the described methodology is publicly available.
Open Datasets Yes Biological data analysis. We compute modes of the microarray data of Arabidopsis thaliana plant (108 samples, 39 dimensions) [24].
Dataset Splits No The paper mentions generating synthetic data and using different sample sizes (10K, 40K, 80K) for evaluation, and also uses a biological dataset, but it does not specify explicit training, validation, or test dataset splits.
Hardware Specification No The paper discusses running time and scalability, but does not provide any specific hardware details such as GPU or CPU models used for the experiments.
Software Dependencies No The paper describes algorithms and methods but does not list any specific software dependencies with version numbers.
Experiment Setup Yes In all experiments, we choose M to be 500. We randomly generate tree-structured graphical models (tree size D = 200, ..., 2000, label size L = 3) and test the speed. We randomly generate tree-structured distributions (D = 20, L = 2). To evaluate the sensitivity of our method to noise, we randomly flip 0%, 5%, 10%, 15% and 20% of the labels of these samples. (An illustrative generator sketch for this synthetic setup also follows below.)
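To make Procedure 1 concrete, below is a minimal Python sketch: it builds geodesic balls on a tree via breadth-first search (step 1) and, for a toy tree MRF small enough to enumerate, finds modes by brute force so the output of the full procedure could be sanity-checked. Everything here is an assumption for illustration, not the authors' code (which is not released): the function names, the toy potentials, and in particular the simplified mode definition (a labeling is treated as a mode if no single-node change lowers its potential), which stands in for the paper's scale-δ definition. The junction-tree construction and Nilsson's M-best algorithm (steps 4-5) are not implemented.

    # Illustrative sketch only; names, potentials, and the Hamming-1 mode
    # definition are assumptions standing in for the paper's scale-delta setup.
    from collections import deque
    from itertools import product

    def geodesic_ball(adj, center, radius):
        """Nodes within tree (hop) distance `radius` of `center`, via BFS (step 1)."""
        dist = {center: 0}
        queue = deque([center])
        while queue:
            u = queue.popleft()
            if dist[u] == radius:
                continue
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        return set(dist)

    def potential(x, node_pot, edge_pot):
        """f(x) = sum of unary and pairwise potentials (lower = more probable)."""
        val = sum(node_pot[u][x[u]] for u in range(len(x)))
        val += sum(ep[x[u]][x[v]] for (u, v), ep in edge_pot.items())
        return val

    def brute_force_modes(num_nodes, num_labels, node_pot, edge_pot, M):
        """All Hamming-1 local minima of f, sorted by potential; keep the M best."""
        modes = []
        for x in product(range(num_labels), repeat=num_nodes):
            fx = potential(x, node_pot, edge_pot)
            is_mode = True
            for u in range(num_nodes):
                for lab in range(num_labels):
                    if lab == x[u]:
                        continue
                    y = x[:u] + (lab,) + x[u + 1:]
                    if potential(y, node_pot, edge_pot) < fx:
                        is_mode = False
                        break
                if not is_mode:
                    break
            if is_mode:
                modes.append((fx, x))
        return sorted(modes)[:M]

    if __name__ == "__main__":
        # Toy chain-shaped tree with D = 4 nodes and L = 2 labels.
        adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
        node_pot = [[0.0, 1.0], [1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
        edge_pot = {(0, 1): [[0.0, 0.5], [0.5, 0.0]],
                    (1, 2): [[0.0, 0.5], [0.5, 0.0]],
                    (2, 3): [[0.0, 0.5], [0.5, 0.0]]}
        print(geodesic_ball(adj, center=1, radius=1))            # step 1: {0, 1, 2}
        print(brute_force_modes(4, 2, node_pot, edge_pot, M=3))  # brute-force modes

The brute-force check scales as L^D and is only usable on toy instances; the point of the paper's procedure is precisely to avoid this enumeration on high-dimensional trees.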
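The synthetic setup quoted above (random tree-structured distributions, noise via label flipping) can likewise be sketched. The paper does not specify its exact generator, so everything below is an assumption for demonstration: a random tree grown by uniform attachment, uniformly random pairwise potentials, ancestral sampling from a directed tree model (root uniform, each child drawn with probability proportional to exp of minus the edge potential), and independent label flips at a given rate.

    # Hedged sketch of a synthetic setup; the paper's actual generator is unspecified.
    import math
    import random

    def random_tree(num_nodes):
        # Each new node attaches to a uniformly chosen earlier node,
        # which always yields a tree on num_nodes vertices.
        return [(random.randrange(i), i) for i in range(1, num_nodes)]

    def random_edge_potentials(edges, num_labels):
        # Uniformly random pairwise potentials; lower potential = more probable.
        return {e: [[random.random() for _ in range(num_labels)]
                    for _ in range(num_labels)] for e in edges}

    def sample_labeling(num_nodes, num_labels, edges, edge_pot):
        # Ancestral sampling from a directed tree rooted at node 0: the root is
        # uniform, each child is drawn given its parent with probability
        # proportional to exp(-edge potential).
        x = [0] * num_nodes
        x[0] = random.randrange(num_labels)
        for parent, child in edges:  # parent index < child index by construction
            weights = [math.exp(-edge_pot[(parent, child)][x[parent]][lab])
                       for lab in range(num_labels)]
            x[child] = random.choices(range(num_labels), weights=weights)[0]
        return x

    def flip_labels(sample, num_labels, flip_rate):
        # Independently flip each label with probability flip_rate
        # (the paper evaluates flip rates of 0%, 5%, 10%, 15% and 20%).
        noisy = list(sample)
        for i, lab in enumerate(noisy):
            if random.random() < flip_rate:
                noisy[i] = random.choice([l for l in range(num_labels) if l != lab])
        return noisy

    if __name__ == "__main__":
        D, L = 20, 2                  # sizes used in the paper's sensitivity study
        edges = random_tree(D)
        edge_pot = random_edge_potentials(edges, L)
        samples = [sample_labeling(D, L, edges, edge_pot) for _ in range(10000)]
        noisy = [flip_labels(x, L, flip_rate=0.10) for x in samples]

The sample count of 10,000 matches the smallest sample size (10K) mentioned in the Dataset Splits row; the tree, potentials, and flip rate are placeholders to be varied as in the quoted setup.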