Distance-Based Network Recovery under Feature Correlation

Authors: David Adametz, Volker Roth

NeurIPS 2014

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "4 Experiments: We first look at synthetic data and compare how well the recovered network matches the true one. Hereby, the accuracy is measured by the f-score using the edges (positive/negative/zero)." and "4.2 Real-World Data: A Network of Biological Pathways: In order to demonstrate the scalability of TiMT, we apply it to the publicly available colon cancer dataset of Sheffer et al. [20]"
Researcher Affiliation | Academia | Department of Mathematics and Computer Science, University of Basel, Switzerland
Pseudocode | Yes | "Algorithm 1: One loop of the MCMC sampler"
Open Source Code | No | The paper does not provide a link to the source code or an explicit statement about its release.
Open Datasets | Yes | "we apply it to the publicly available colon cancer dataset of Sheffer et al. [20]"
Dataset Splits | No | The paper does not explicitly specify training/validation/test dataset splits, specific percentages, or absolute sample counts for each split needed to reproduce the experiment.
Hardware Specification | Yes | "Runtime on a standard 3 GHz PC was 3:10 hours for TiMT"
Software Dependencies | No | The paper does not provide specific software dependencies with version numbers.
Experiment Setup | Yes | "Hyperparameters α, β and d: At some point in every Bayesian analysis, all hyperparameters need to be specified in a sensible manner. Currently, the occurrence of d in Eq. (9) is particularly problematic, since (i) the number of latent features is unknown and (ii) it critically affects the balance between determinants. To resolve this issue, recall that α must satisfy α > (d - 1)/2, which can alternatively be expressed as α = (vd - n + 1)/2 with v > 1 + (n - 2)/d. Thereby, we arrive at ℓ(W; v, β, D, 1_n) = (d/2) log|W| - (d/2) log(1_n^T W 1_n) - (vd/2) log|I_n - (β/4) W Q_D|  (10), where d now influences the likelihood on a global level and can be used as a temperature, reminiscent of simulated annealing techniques for optimization. In more detail, we initialize the MCMC sampler with a small value of d and increase it slowly, until the acceptance ratio is below, say, 1 percent. After that event, all samples of W are averaged to obtain the final network."
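
For readers who want to check the quoted setup numerically, the annealed log-likelihood in Eq. (10) can be written down directly. The following is a minimal sketch under stated assumptions: W is taken to be a symmetric n x n matrix with positive determinant, Q_D is assumed to be the matrix built from the distance matrix D as defined in the paper's Eq. (9) (not reproduced in this report), and the function name and argument layout are illustrative rather than the authors' code.

```python
import numpy as np

def log_likelihood_eq10(W, Q_D, v, beta, d):
    """Sketch of the annealed log-likelihood l(W; v, beta, D, 1_n) of Eq. (10).

    Assumptions (the authors' code is not available):
    - W is a symmetric n x n matrix with positive determinant,
    - Q_D is the matrix derived from the distance matrix D as in Eq. (9),
    - d acts as a global temperature, v > 1 + (n - 2)/d, beta > 0.
    """
    n = W.shape[0]
    ones = np.ones(n)

    _, logdet_W = np.linalg.slogdet(W)                  # log|W|
    quad = ones @ W @ ones                              # 1_n^T W 1_n
    _, logdet_M = np.linalg.slogdet(
        np.eye(n) - 0.25 * beta * (W @ Q_D))            # log|I_n - (beta/4) W Q_D|

    return 0.5 * d * logdet_W - 0.5 * d * np.log(quad) - 0.5 * v * d * logdet_M
```

Because d multiplies every term, raising it sharpens the likelihood globally, which is what lets d play the role of an annealing temperature in the quoted schedule.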
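The temperature schedule described in the quoted text (start with a small d, increase it slowly until the acceptance ratio falls below roughly 1 percent, then average the sampled W) can be sketched as a driver loop. The per-iteration update is the paper's Algorithm 1, which is not reproduced in this report, so `mcmc_sweep` below is a hypothetical stand-in, and the schedule constants (`d_init`, `d_growth`, `window`) are illustrative guesses rather than values from the paper.

```python
import numpy as np

def annealed_network_estimate(W0, mcmc_sweep, d_init=0.1, d_growth=1.05, window=1000):
    """Sketch of the annealing schedule on d described in the experiment setup.

    `mcmc_sweep(W, d)` is a hypothetical stand-in for the paper's Algorithm 1
    (one loop of the MCMC sampler); it is assumed to return the new state W and
    a bool indicating whether the proposal was accepted, evaluating the annealed
    likelihood of Eq. (10) internally at temperature d.
    """
    W, d = W0.copy(), d_init
    samples = []
    while True:
        accepted = 0
        for _ in range(window):
            W, was_accepted = mcmc_sweep(W, d)
            accepted += int(was_accepted)
            samples.append(W.copy())
        if accepted / window < 0.01:      # acceptance ratio below ~1 percent
            break
        d *= d_growth                     # increase the temperature d slowly
    # "All samples of W are averaged to obtain the final network."
    return np.mean(samples, axis=0)
```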
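On the evaluation side, the Research Type row quotes the paper as scoring recovered networks by an F-score over edge labels (positive/negative/zero). One way to compute such a score is sketched below; treating each undirected edge as a three-class label and macro-averaging the per-class F-scores is an assumption, since the quoted text does not state the averaging convention.

```python
import numpy as np

def edge_f_score(W_true, W_est):
    """F-score over edge labels {-1, 0, +1} for two symmetric n x n matrices.

    Each upper-triangular entry's sign encodes a negative, absent, or positive
    edge.  Macro-averaging over the three classes is an assumption; the quoted
    text only states that the F-score uses positive/negative/zero edges.
    """
    iu = np.triu_indices_from(W_true, k=1)          # each undirected edge once
    t = np.sign(W_true[iu]).astype(int)
    e = np.sign(W_est[iu]).astype(int)

    per_class = []
    for label in (-1, 0, 1):
        tp = np.sum((e == label) & (t == label))
        fp = np.sum((e == label) & (t != label))
        fn = np.sum((e != label) & (t == label))
        if tp == 0:
            per_class.append(0.0)
            continue
        precision = tp / (tp + fp)
        recall = tp / (tp + fn)
        per_class.append(2 * precision * recall / (precision + recall))
    return float(np.mean(per_class))
```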