Diet Networks: Thin Parameters for Fat Genomics

Authors: Adriana Romero, Pierre Luc Carrier, Akram Erraqabi, Tristan Sylvain, Alex Auvolat, Etienne Dejoie, Marc-André Legault, Marie-Pierre Dubé, Julie G. Hussin, Yoshua Bengio

ICLR 2017

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | We show experimentally on a population stratification task of interest to medical studies that the proposed approach can significantly reduce both the number of parameters and the error rate of the classifier. |
| Researcher Affiliation | Academia | Adriana Romero, Pierre Luc Carrier, Akram Erraqabi, Tristan Sylvain, Alex Auvolat, Etienne Dejoie: Montreal Institute for Learning Algorithms, Montreal, Quebec, Canada (firstname.lastname@umontreal.ca, except adriana.romero.soriano@umontreal.ca and pierre-luc.carrier@umontreal.ca). Marc-André Legault (1), Marie-Pierre Dubé (1,2,3): (1) University of Montreal, Faculty of Medicine; (2) Montreal Heart Institute; (3) Beaulieu-Saucier Pharmacogenomics Centre, Montreal, Quebec, Canada (marc-andre.legault.1@umontreal.ca, marie-pierre.dube@umontreal.ca). Julie G. Hussin: Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, UK (julieh@molbiol.ox.ac.uk). Yoshua Bengio: Montreal Institute for Learning Algorithms, Montreal, Quebec, Canada (yoshua.umontreal@gmail.com). |
| Pseudocode | No | The paper does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code to reproduce the experiments can be found here: https://github.com/adri-romsor/DietNetworks |
| Open Datasets | Yes | We evaluate our method on a publicly available dataset for ancestry prediction, the 1000 Genomes dataset (http://www.internationalgenome.org/), that best represents the human population diversity. |
| Dataset Splits | Yes | We split the data into 5 folds of equal size. A single fold is retained for test, whereas three of the remaining folds are used as training data and the final fold is used as validation data. We repeated the process 5 times (one per fold) and report the means and standard deviations of results on the different test sets. (A code sketch of this protocol appears below the table.) |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory specifications) used for running the experiments. |
| Software Dependencies | Yes | The authors would like to thank the developers of Theano (Theano Development Team, 2016) and Lasagne (Lasagne, 2016). |
| Experiment Setup | Yes | We designed a basic architecture with 2 hidden layers followed by a softmax layer to perform ancestry prediction. All hidden layers have 100 units. All models were trained by means of stochastic gradient descent with an adaptive learning rate (Tieleman & Hinton, 2012), both for γ = 0 (no auxiliary reconstruction loss) and γ = 10, using dropout, limiting the norm of the weights to 1, and/or applying weight decay to reduce overfitting. (A code sketch of this setup appears below the table.) |
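
The splitting protocol in the Dataset Splits row is easy to restate in code. Below is a minimal NumPy sketch of that protocol; the function name, the seed, and the choice of which non-test fold serves as validation are assumptions (the paper does not specify that ordering).

```python
import numpy as np

def five_fold_splits(n_samples, seed=0):
    """Yield (train, valid, test) index arrays: 5 equal folds, where each
    round holds out 1 fold for test, 3 for training, and 1 for validation."""
    rng = np.random.RandomState(seed)
    folds = np.array_split(rng.permutation(n_samples), 5)
    for k in range(5):
        rest = [folds[j] for j in range(5) if j != k]
        train = np.concatenate(rest[:3])  # three folds -> training data
        valid = rest[3]                   # final remaining fold -> validation
        test = folds[k]                   # held-out fold -> test
        yield train, valid, test
```

Test metrics from the 5 rounds would then be aggregated as a mean and standard deviation, matching how the paper reports results.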
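For the Experiment Setup row, the adaptive learning rate method cited (Tieleman & Hinton, 2012) is RMSprop. The authors implemented their models in Theano/Lasagne; the sketch below is a hedged PyTorch restatement of the basic discriminative network only, not the Diet Networks parameter-prediction mechanism or the γ-weighted reconstruction loss. The input/output sizes, ReLU nonlinearity, dropout rate, learning rate, and weight decay value are illustrative assumptions, not values from the paper.

```python
import torch
import torch.nn as nn

n_feats, n_classes = 300_000, 26  # placeholders: ~300k SNPs, 26 ancestry classes

# Basic architecture from the paper: 2 hidden layers of 100 units + softmax.
model = nn.Sequential(
    nn.Linear(n_feats, 100), nn.ReLU(), nn.Dropout(0.5),
    nn.Linear(100, 100), nn.ReLU(), nn.Dropout(0.5),
    nn.Linear(100, n_classes),  # softmax is folded into the loss below
)

criterion = nn.CrossEntropyLoss()  # cross-entropy over ancestry classes
# RMSprop = Tieleman & Hinton (2012); weight_decay gives the L2 penalty.
optimizer = torch.optim.RMSprop(model.parameters(), lr=1e-3, weight_decay=1e-4)

@torch.no_grad()
def clip_weight_norms(module, max_norm=1.0):
    """Re-project each unit's incoming weight vector onto the L2 ball of
    radius 1, matching 'limiting the norm of the weights to 1'."""
    for layer in module.modules():
        if isinstance(layer, nn.Linear):
            layer.weight.copy_(layer.weight.renorm(p=2, dim=0, maxnorm=max_norm))
```

A training loop would call clip_weight_norms(model) after each optimizer.step(). Note that the paper applied these regularizers selectively ("and/or"), not necessarily all at once in every model.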