Mode Estimation for High Dimensional Discrete Tree Graphical Models
Authors: Chao Chen, Han Liu, Dimitris Metaxas, Tianqi Zhao
NeurIPS 2014
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results demonstrate the accuracy and efficiency of our algorithm. More theoretical guarantee of our algorithm can be found in [7]. To validate our method, we first show the scalability and accuracy of our algorithm in synthetic data. Furthermore, we demonstrate using biological data how modes can be used as a novel analysis tool. |
| Researcher Affiliation | Academia | Chao Chen, Department of Computer Science, Rutgers, The State University of New Jersey, Piscataway, NJ 08854-8019, chao.chen.cchen@gmail.com; Han Liu, Department of Operations Research and Financial Engineering, Princeton University, Princeton, NJ 08544, hanliu@princeton.edu; Dimitris N. Metaxas, Department of Computer Science, Rutgers, The State University of New Jersey, Piscataway, NJ 08854-8019, dnm@cs.rutgers.edu; Tianqi Zhao, Department of Operations Research and Financial Engineering, Princeton University, Princeton, NJ 08544, tianqi@princeton.edu |
| Pseudocode | Yes | Procedure 1 Compute-M-Modes. Input: a tree G, a potential function f, and a scale δ. Output: the M modes of the lowest potential. 1: Construct geodesic balls ℬ = {B_r(c) \| c ∈ V}, where r = ⌊δ/2⌋ + 1. 2: for all B ∈ ℬ do 3: M^δ_B = the set of local modes of B. 4: Construct a junction tree (Figure 2). The label set of each supernode is its local modes. 5: Compute the M lowest-potential labelings of the junction tree, using Nilsson's algorithm. |
| Open Source Code | No | The paper neither links to source code nor states that code for the described methodology is publicly available. |
| Open Datasets | Yes | Biological data analysis. We compute modes of the microarray data of Arabidopsis thaliana plant (108 samples, 39 dimensions) [24]. |
| Dataset Splits | No | The paper mentions generating synthetic data and using different sample sizes (10K, 40K, 80K) for evaluation, and also uses a biological dataset, but it does not specify explicit training, validation, or test dataset splits. |
| Hardware Specification | No | The paper discusses running time and scalability, but does not provide any specific hardware details such as GPU or CPU models used for the experiments. |
| Software Dependencies | No | The paper describes algorithms and methods but does not list any specific software dependencies with version numbers. |
| Experiment Setup | Yes | In all experiments, we choose M to be 500. We randomly generate tree-structured graphical model (tree size D = 200 ... 2000, label size L = 3) and test the speed. We randomly generate tree-structured distributions (D = 20, L = 2). To evaluate the sensitivity of our method to noise, we randomly flip 0%, 5%, 10%, 15% and 20% labels of these samples. |
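
To make the goal of Procedure 1 concrete, the sketch below is a brute-force reference for a very small random tree model, useful only as a sanity check. It is not the paper's algorithm: it enumerates every labeling instead of using geodesic balls, a junction tree, and Nilsson's algorithm. It assumes that a mode at scale δ is a labeling whose potential is strictly lower than that of every other labeling within Hamming distance δ (the table above does not restate the definition), and that the potential is a sum of unary and pairwise terms. All function names and the uniform random potentials are hypothetical.

```python
import itertools
import random


def random_tree(num_nodes, rng):
    """Random tree: attach each node i >= 1 to a uniformly chosen earlier node."""
    return [(rng.randrange(i), i) for i in range(1, num_nodes)]


def random_potentials(num_nodes, edges, num_labels, rng):
    """Hypothetical unary and pairwise tables with uniform random entries."""
    unary = [[rng.random() for _ in range(num_labels)] for _ in range(num_nodes)]
    pairwise = {
        e: [[rng.random() for _ in range(num_labels)] for _ in range(num_labels)]
        for e in edges
    }
    return unary, pairwise


def potential(labeling, unary, pairwise):
    """Assumed form of f: unary terms plus pairwise terms over the tree edges."""
    total = sum(unary[i][y] for i, y in enumerate(labeling))
    total += sum(t[labeling[u]][labeling[v]] for (u, v), t in pairwise.items())
    return total


def hamming_neighbors(labeling, num_labels, delta):
    """Yield every labeling that differs from `labeling` in 1..delta positions."""
    n = len(labeling)
    for k in range(1, delta + 1):
        for positions in itertools.combinations(range(n), k):
            choices = [
                [l for l in range(num_labels) if l != labeling[p]]
                for p in positions
            ]
            for new_labels in itertools.product(*choices):
                neighbor = list(labeling)
                for p, l in zip(positions, new_labels):
                    neighbor[p] = l
                yield tuple(neighbor)


def brute_force_modes(unary, pairwise, num_labels, delta, top_m):
    """Return up to top_m lowest-potential modes as (potential, labeling) pairs."""
    n = len(unary)
    modes = []
    for labeling in itertools.product(range(num_labels), repeat=n):
        f = potential(labeling, unary, pairwise)
        # Assumed mode test: strictly lower than every labeling within delta.
        if all(
            f < potential(nb, unary, pairwise)
            for nb in hamming_neighbors(labeling, num_labels, delta)
        ):
            modes.append((f, labeling))
    modes.sort()
    return modes[:top_m]


# Tiny instance (D = 8, L = 2); the paper's experiments use far larger trees.
rng = random.Random(0)
edges = random_tree(8, rng)
unary, pairwise = random_potentials(8, edges, 2, rng)
modes = brute_force_modes(unary, pairwise, num_labels=2, delta=2, top_m=5)
```

The enumeration costs on the order of L^D potential evaluations times the size of each Hamming neighborhood, so it is practical only for a handful of nodes. That cost is the reason a procedure restricted to local modes of small balls matters at the tree sizes quoted in the Experiment Setup row.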