Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Hierarchical Methods of Moments
Authors: Matteo Ruffini, Guillaume Rabusseau, Borja Balle
NeurIPS 2017 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on topic modeling show that our method outperforms previous tensor decomposition methods in terms of speed and model quality. |
| Researcher Affiliation | Collaboration | Matteo Ruffini, Universitat Politècnica de Catalunya; Guillaume Rabusseau, McGill University; Borja Balle, Amazon Research |
| Pseudocode | Yes | Algorithm 1 SIDIWO: Simultaneous Diagonalization based on Whitening and Optimization; Algorithm 2 Splitting a corpus into two parts |
| Open Source Code | Yes | The implementation of the described algorithms can be found at this link: https://github.com/mruffini/Hierarchical-Methods-of-Moments. |
| Open Datasets | Yes | We consider the full set of NIPS papers accepted between 1987 and 2015, containing n = 11,463 papers [28]. [28] Valerio Perrone, Paul A Jenkins, Dario Spano, and Yee Whye Teh. Poisson random fields for dynamic feature models. arXiv preprint arXiv:1611.07460, 2016. |
| Dataset Splits | No | The paper describes generating synthetic data and processing real-world datasets for unsupervised hierarchical clustering, but it does not specify explicit training, validation, and test splits for model training or evaluation in the traditional sense. |
| Hardware Specification | Yes | All the experiments were run on a MacBook Pro with an Intel Core i5 processor. |
| Software Dependencies | No | The experiments were performed in Python 2.7, using numpy library for linear algebra operations, with the exception of the implementation of the method from [22], for which we used the author's Matlab implementation. While software is mentioned, specific version numbers for numpy or Matlab are not provided. |
| Experiment Setup | Yes | We generate 400 samples according to this model and we iteratively run Algorithm 2 to create a hierarchical binary tree with 8 leafs. ...we keep the d = 3000 most frequent words as vocabulary and we iteratively run Algorithm 2 to create a binary tree of depth 4. ...relevance [29] of a word w ∈ Cnode ⊂ C is defined by r(w, Cnode) = λ log P[w\|Cnode] + (1 − λ) log P[w\|C], where the weight parameter is set to λ = 0.7 |
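
The relevance score quoted in the Experiment Setup row can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: it transcribes the expression as it appears in that row (reading the second weight as 1 minus λ), and it assumes that P[w|Cnode] and P[w|C] are estimated by empirical word frequencies. The names `relevance`, `top_words`, `node_counts`, and `corpus_counts` are hypothetical, and the original paper should be consulted for the exact form of the definition.

```python
import math
from collections import Counter


def relevance(word, node_counts, corpus_counts, lam=0.7):
    """Score a word as lam * log P[w|node] + (1 - lam) * log P[w|corpus].

    Probabilities are empirical frequencies (an assumption of this sketch).
    The word must occur in both count tables, otherwise log(0) is undefined.
    """
    p_node = node_counts[word] / sum(node_counts.values())
    p_corpus = corpus_counts[word] / sum(corpus_counts.values())
    return lam * math.log(p_node) + (1.0 - lam) * math.log(p_corpus)


def top_words(node_counts, corpus_counts, k=10, lam=0.7):
    """Return the k words of a node with the highest relevance score."""
    ranked = sorted(
        node_counts,
        key=lambda w: relevance(w, node_counts, corpus_counts, lam),
        reverse=True,
    )
    return ranked[:k]


# Toy word counts for one tree node and for the whole corpus (made-up data).
node_counts = Counter({"tensor": 3, "moment": 1})
corpus_counts = Counter({"tensor": 4, "moment": 4, "kernel": 8})
print(top_words(node_counts, corpus_counts, k=2))
```

With λ = 0.7, as reported in the row, the score weights the within-node frequency more heavily than the corpus-wide frequency.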