reproducibilityindex.ai

Learning Representations without Compositional Assumptions

Authors: Tennison Liu, Jeroen Berrevoets, Zhaozhi Qian, Mihaela Van Der Schaar

ICML 2023

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	5. Empirical Investigations Having introduced the challenges of learning from multi-view data ITW and our proposed method to address it, we now turn to quantitatively evaluating our method: 1. Learning ITW: What is the problem? Section 5.1 employs a simulation of ITW multi-view data to probe the performances of different compositional assumptions. ... 3. Performance: Does it work? Section 5.2 evaluates downstream performance of our method against state-of-the-art benchmarks on real world dataset. 4. Gains: Why does it work? We deconstruct our method to investigate its sources of performance gain.
Researcher Affiliation	Academia	¹DAMTP, University of Cambridge, Cambridge, UK; ²Alan Turing Institute, London, UK. Correspondence to: Tennison Liu <tl522@cam.ac.uk>.
Pseudocode	No	The paper describes its method using textual descriptions and mathematical equations, but it does not include any explicit pseudocode blocks or algorithm figures.
Open Source Code	Yes	Our implementation can be found at https://github.com/tennisonliu/LEGATO and at the wider lab repository https://github.com/vanderschaarlab/LEGATO.
Open Datasets	Yes	We evaluate our method on three real-world datasets. TCGA (Tomczak et al., 2015)... UK Biobank (Sudlow et al., 2015)... UCI-MFS (van Breukelen et al., 1998)... For constructing multiple views and labels, the following datasets were downloaded from http://gdac.broadinstitute.org: ... We used data from the UK Biobank (Sudlow et al., 2015)... The lung cancer dataset is extracted from UK Biobank using the scripts provided in https://github.com/callta/synthetic-data-analyses/tree/main/code...
Dataset Splits	Yes	All models are implemented in PyTorch. The data is split 60-20-20 into an unlabeled training set, labeled training set, and test set respectively, and all reported results are averaged over 10 runs, where different data splits are sampled for each run.
Hardware Specification	Yes	All experiments are run on an NVIDIA Tesla K40C GPU.
Software Dependencies	No	The paper states 'All models are implemented in PyTorch' but does not specify the version number of PyTorch or any other software dependencies.
Experiment Setup	Yes	For all experiments, we use batch size of 64, but tune the learning rate η ∈ {0.001, 0.01, 0.1} and weight decay ∈ {0.001, 0.01, 0.1}. ... Additionally, we employ early stopping to terminate model training after 20 epochs of no improvement on the validation set, after which the best model is returned for evaluation.
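
The Dataset Splits and Experiment Setup rows describe a protocol that can be sketched in a few lines. The following is a minimal illustration only, not the authors' code: the names `split_indices`, `EarlyStopping`, and `GRID` are hypothetical, the shuffling scheme and per-run seeding are assumptions, and the quoted text does not say how the validation set is carved out, so the sketch leaves that to the caller.

```python
import random
from itertools import product

# Hypothetical helper: the quoted text gives only the 60-20-20 proportions,
# not how the split is sampled. A seeded shuffle is assumed here.
def split_indices(n, seed, fractions=(0.6, 0.2)):
    """Return (unlabeled_train, labeled_train, test) index lists."""
    rng = random.Random(seed)
    idx = list(range(n))
    rng.shuffle(idx)
    n_unlabeled = round(fractions[0] * n)
    n_labeled = round(fractions[1] * n)
    cut = n_unlabeled + n_labeled
    return idx[:n_unlabeled], idx[n_unlabeled:cut], idx[cut:]

# Hypothetical early-stopping tracker with the quoted patience of 20 epochs.
class EarlyStopping:
    def __init__(self, patience=20):
        self.patience = patience
        self.best = float("inf")
        self.best_epoch = -1
        self.bad_epochs = 0

    def step(self, val_loss, epoch):
        """Record one validation loss; return True when training should stop."""
        if val_loss < self.best:
            self.best = val_loss
            self.best_epoch = epoch  # the caller would checkpoint the model here
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience

# Search grid quoted in the Experiment Setup row: (learning rate, weight decay).
GRID = list(product([0.001, 0.01, 0.1], [0.001, 0.01, 0.1]))

# Ten runs, each with a freshly sampled split (the seed-per-run scheme is assumed).
splits = [split_indices(1000, seed=run) for run in range(10)]
```

Reported metrics would then be averaged over the ten entries of `splits`, with the best configuration from `GRID` chosen on validation data.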