Learning to Discover Sparse Graphical Models
Authors: Eugene Belilovsky, Kyle Kastner, Gael Varoquaux, Matthew B. Blaschko
ICLR 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experimental evaluations focus on the challenging high dimensional settings in which p > n and consider both synthetic data and real data from genetics and neuroimaging. |
| Researcher Affiliation | Academia | Eugene Belilovsky, INRIA Galen, University of Paris-Saclay, France (eugene.belilovsky@inria.fr); Kyle Kastner, MILA Lab, University of Montreal, Canada (kyle.kastner@umontreal.ca); Gael Varoquaux, INRIA Parietal, Saclay, France (gael.varoquaux@inria.fr); Matthew B. Blaschko, Center for Processing Speech and Images, KU Leuven, Belgium (matthew.blaschko@esat.kuleuven.be) |
| Pseudocode | Yes | Algorithm 1: Training a GGM edge estimator. For i ∈ {1, .., N} do: Sample G_i ~ P(G); Sample Σ_i ~ P(Σ | G = G_i); X_i ← {x_j ~ N(0, Σ_i)}_{j=1}^{n}; Construct (Y_i, Σ̂_i) pair from (G_i, X_i); end for. Select function class F (e.g., CNN). Optimize: min_{f ∈ F} (1/N) Σ_{k=1}^{N} l̂(f(Σ̂_k), Y_k). (A minimal code sketch of this procedure follows the table.) |
| Open Source Code | No | The paper does not contain an explicit statement about releasing its source code for the described methodology, nor does it provide a link to a code repository. |
| Open Datasets | Yes | We use the ABIDE dataset (Di Martino et al., 2014), a large scale resting-state fMRI dataset. It gathers brain scans from 539 individuals suffering from autism spectrum disorder and 573 controls over 16 sites. |
| Dataset Splits | No | Each network is trained continuously with new samples generated until the validation error saturates. For a given precision matrix we generate 5 possible X samples to be used as training data, with a total of approximately 100K training samples used for each network. The networks are optimized using ADAM (Kingma & Ba, 2015) coupled with cross-entropy loss as the objective function (cf. Sec. 2.1). We use batch normalization at each layer. Additionally, we found that using the absolute value of the true partial correlations as labels, instead of hard binary labels, improves results. (A sketch of this soft-label construction follows the table.) |
| Hardware Specification | No | We compute the average execution time of our method compared to Graph Lasso and BDGraph on a CPU in Table 4. |
| Software Dependencies | No | We compared our learned estimator against the scikit-learn (Pedregosa et al., 2011) implementation of Graphical Lasso... We used the BDGraph R-package... as well as the R-package rags2ridges (Peeters et al., 2015). (An illustrative Graphical Lasso baseline call follows the table.) |
| Experiment Setup | Yes | We train networks taking in 39, 50, and 500 node graphs. ... In all cases we have 50 feature maps of 3x3 kernels. The 39- and 50-node networks use 6 convolutional layers with d_k = k + 1; the 500-node network uses 8 convolutional layers with d_k = 2k + 1. We use ReLU activations. The last layer has a 1x1 convolution and a sigmoid outputting a value of 0 to 1 for each edge. ... The networks are optimized using ADAM (Kingma & Ba, 2015) coupled with cross-entropy loss as the objective function (cf. Sec. 2.1). We use batch normalization at each layer. (An architecture sketch follows the table.) |
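
As a companion to the Algorithm 1 row above, here is a minimal sketch of the training-pair generation, assuming NumPy and scikit-learn's `make_sparse_spd_matrix` as the sampler for sparse precision matrices; the function names, dimensions, and sparsity parameter are illustrative and not taken from the paper.

```python
# Hypothetical sketch of Algorithm 1 (training-pair generation for a GGM edge
# estimator). Names and parameter values are illustrative, not from the paper.
import numpy as np
from sklearn.datasets import make_sparse_spd_matrix


def sample_training_pair(p=39, n=35, alpha=0.95, seed=None):
    """Sample one (empirical covariance, edge-label) pair as in Algorithm 1."""
    rng = np.random.default_rng(seed)
    # Sample a sparse precision matrix K; its off-diagonal support encodes the graph G.
    K = make_sparse_spd_matrix(p, alpha=alpha,
                               random_state=int(rng.integers(1 << 31)))
    Sigma = np.linalg.inv(K)
    # Draw n i.i.d. observations x_j ~ N(0, Sigma) and form the empirical covariance.
    X = rng.multivariate_normal(np.zeros(p), Sigma, size=n)
    Sigma_hat = X.T @ X / n
    # Binary edge labels Y: off-diagonal non-zeros of the precision matrix.
    Y = (np.abs(K) > 1e-8).astype(np.float32)
    np.fill_diagonal(Y, 0)
    return Sigma_hat, Y


# Build N such pairs; an estimator f (e.g. a CNN) is then fit by minimizing
# (1/N) * sum_k loss(f(Sigma_hat_k), Y_k), with cross-entropy loss per the paper.
pairs = [sample_training_pair(seed=i) for i in range(8)]
```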
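The dataset-splits row notes that the absolute values of the true partial correlations are used as soft labels instead of hard binary labels. A hedged sketch of that label construction, assuming the true precision matrix `K` is available from the simulation:

```python
# Sketch of soft edge labels from absolute partial correlations. The helper
# name is illustrative; the formula rho_ij = -K_ij / sqrt(K_ii * K_jj) is standard.
import numpy as np


def soft_edge_labels(K):
    """Return |partial correlations| as soft labels in [0, 1]."""
    d = np.sqrt(np.diag(K))
    rho = -K / np.outer(d, d)          # partial correlations
    labels = np.abs(rho)
    np.fill_diagonal(labels, 0.0)      # no self-edges
    return labels
```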
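For the software-dependencies row, an illustrative call to the scikit-learn Graphical Lasso baseline could look like the following (class name per recent scikit-learn releases, where older versions used `GraphLassoCV`); the data here is a synthetic stand-in, not the paper's experimental data.

```python
# Illustrative Graphical Lasso baseline with scikit-learn.
import numpy as np
from sklearn.covariance import GraphicalLassoCV

rng = np.random.default_rng(0)
X = rng.standard_normal((35, 39))      # n = 35 samples, p = 39 variables (p > n)
model = GraphicalLassoCV().fit(X)
precision = model.precision_           # estimated sparse precision matrix
edges = np.abs(precision) > 1e-8       # recovered graph support
```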
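For the experiment-setup row, a minimal architecture sketch, assuming PyTorch (the paper does not name a framework) and illustrative class and variable names: 6 convolutional layers of 50 feature maps with 3x3 kernels, dilation d_k = k + 1, batch normalization, ReLU, and a final 1x1 convolution with a sigmoid giving a per-edge probability.

```python
# Hedged sketch of the dilated-CNN edge estimator described in the setup row.
import torch
import torch.nn as nn


class EdgeEstimatorCNN(nn.Module):
    def __init__(self, n_layers=6, n_maps=50):
        super().__init__()
        layers, in_ch = [], 1  # input: empirical covariance as a 1-channel p x p "image"
        for k in range(1, n_layers + 1):
            d = k + 1  # dilation schedule d_k = k + 1 (the 500-node net uses 2k + 1)
            layers += [
                nn.Conv2d(in_ch, n_maps, kernel_size=3, dilation=d, padding=d),
                nn.BatchNorm2d(n_maps),
                nn.ReLU(inplace=True),
            ]
            in_ch = n_maps
        layers += [nn.Conv2d(in_ch, 1, kernel_size=1), nn.Sigmoid()]
        self.net = nn.Sequential(*layers)

    def forward(self, sigma_hat):          # sigma_hat: (batch, 1, p, p)
        return self.net(sigma_hat)         # per-edge probabilities in [0, 1]


model = EdgeEstimatorCNN()
probs = model(torch.randn(2, 1, 39, 39))   # e.g. 39-node graphs
```

Training would pair this with binary cross-entropy (or the soft labels above) and the ADAM optimizer, as stated in the quoted setup.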