Explaining Latent Representations with a Corpus of Examples

Authors: Jonathan Crabbé, Zhaozhi Qian, Fergus Imrie, Mihaela van der Schaar

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Through experiments on tasks ranging from mortality prediction to image classification, we demonstrate that these decompositions are robust and accurate.
Researcher Affiliation | Academia | Jonathan Crabbé, University of Cambridge, jc2133@cam.ac.uk; Zhaozhi Qian, University of Cambridge, zq224@maths.cam.ac.uk; Fergus Imrie, UCLA, imrie@g.ucla.edu; Mihaela van der Schaar, University of Cambridge, The Alan Turing Institute, UCLA, mv472@cam.ac.uk
Pseudocode | No | No pseudocode or algorithm blocks are present in the paper. The methodology is described in prose and mathematical formulations.
Open Source Code | Yes | The code for our method and experiments is available on the Github repository https://github.com/JonathanCrabbe/Simplex.
Open Datasets | Yes | We use two different datasets with distinct tasks for our experiment: (1) 240,486 patients enrolled in the American SEER program [31]. We consider the binary classification task of predicting cancer mortality for patients with prostate cancer. (2) 70,000 MNIST images of handwritten digits [32].
Dataset Splits | No | We start with a dataset D that we split into a training set D_train and a testing set D_test. We train and validate an MLP risk model with D_USA.
Hardware Specification | No | All the experiments have been replicated on different machines.
Software Dependencies | No | The paper mentions training a multilayer perceptron (MLP) and a convolutional neural network (CNN), implying software frameworks, but does not provide specific version numbers for any software dependencies (e.g., Python, PyTorch, TensorFlow versions).
Experiment Setup | No | The paper describes general experimental settings such as using K corpus examples and adding an L1 penalty, but does not provide specific hyperparameter values like learning rates, batch sizes, number of epochs, or optimizer settings for the trained MLP or CNN models.
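
The Open Datasets and Dataset Splits rows describe the data pipeline without reporting split proportions. The sketch below shows one way to reproduce the MNIST side of the setup; it assumes torchvision's MNIST loader, and the 80/20 ratio, the fixed seed, and the random_split call are placeholders of our own choosing, not values from the paper.

```python
import torch
from torch.utils.data import random_split
from torchvision import datasets, transforms

# Load the MNIST training partition (60,000 of the 70,000 digits
# cited in the Open Datasets row).
mnist = datasets.MNIST(root="./data", train=True, download=True,
                       transform=transforms.ToTensor())

# The paper describes a D_train / D_test split but not its proportions;
# the 80/20 ratio here is an arbitrary placeholder, not the authors' choice.
n_train = int(0.8 * len(mnist))
torch.manual_seed(0)  # the paper reports no seed either
d_train, d_test = random_split(mnist, [n_train, len(mnist) - n_train])
print(len(d_train), len(d_test))  # 48000 12000
```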
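The Experiment Setup row notes that the method fits a decomposition over K corpus examples with a sparsity penalty, but the paper leaves optimizer settings unspecified. The following is a minimal sketch of that fitting step, not the authors' implementation (see the Github repository above for that): the softmax parameterization, the n_keep-style penalty (used because a plain L1 norm is constant on the probability simplex), and every hyperparameter value (n_keep, reg, n_epochs, lr) are our assumptions.

```python
import torch

def fit_simplex_weights(corpus_latents, test_latent,
                        n_keep=5, reg=0.1, n_epochs=2000, lr=0.1):
    """Approximate test_latent as a convex combination of corpus rows.

    corpus_latents: (K, d) latent representations of the K corpus examples.
    test_latent:    (d,)   latent representation of the example to explain.
    Returns a (K,) weight vector on the probability simplex.
    """
    K = corpus_latents.shape[0]
    logits = torch.zeros(K, requires_grad=True)  # softmax keeps weights >= 0, summing to 1
    opt = torch.optim.Adam([logits], lr=lr)
    for _ in range(n_epochs):
        opt.zero_grad()
        w = torch.softmax(logits, dim=0)
        residual = test_latent - w @ corpus_latents  # decomposition error in latent space
        # A plain L1 norm is constant on the simplex, so penalize the mass
        # outside the n_keep largest weights instead (our stand-in for the
        # paper's sparsity regularization).
        sorted_w, _ = torch.sort(w, descending=True)
        loss = residual.pow(2).sum() + reg * sorted_w[n_keep:].sum()
        loss.backward()
        opt.step()
    return torch.softmax(logits, dim=0).detach()

# Usage with random stand-ins for a real model's latent representations:
H = torch.randn(50, 16)   # corpus of K = 50 latent vectors
h = torch.randn(16)       # test latent to decompose
w = fit_simplex_weights(H, h)
print(torch.topk(w, 3))   # weights of the most influential corpus examples
```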