Learning Representations without Compositional Assumptions

Authors: Tennison Liu, Jeroen Berrevoets, Zhaozhi Qian, Mihaela van der Schaar

ICML 2023

Reproducibility variables, each listed with its result and the supporting LLM response (quoted from the paper where applicable):

Research Type: Experimental
LLM Response: "5. Empirical Investigations. Having introduced the challenges of learning from multi-view data ITW and our proposed method to address it, we now turn to quantitatively evaluating our method: 1. Learning ITW: What is the problem? Section 5.1 employs a simulation of ITW multi-view data to probe the performance of different compositional assumptions. ... 3. Performance: Does it work? Section 5.2 evaluates the downstream performance of our method against state-of-the-art benchmarks on real-world datasets. 4. Gains: Why does it work? We deconstruct our method to investigate its sources of performance gain."

Researcher Affiliation: Academia
LLM Response: "DAMTP, University of Cambridge, Cambridge, UK; Alan Turing Institute, London, UK. Correspondence to: Tennison Liu <tl522@cam.ac.uk>."

Pseudocode: No
LLM Response: The paper describes its method using textual descriptions and mathematical equations, but it does not include any explicit pseudocode blocks or algorithm figures.

Open Source Code: Yes
LLM Response: "Our implementation can be found at https://github.com/tennisonliu/LEGATO and at the wider lab repository https://github.com/vanderschaarlab/LEGATO."

Open Datasets: Yes
LLM Response: "We evaluate our method on three real-world datasets. TCGA (Tomczak et al., 2015)... UK Biobank (Sudlow et al., 2015)... UCI-MFS (van Breukelen et al., 1998)... For constructing multiple views and labels, the following datasets were downloaded from http://gdac.broadinstitute.org: ... We used data from the UK Biobank (Sudlow et al., 2015)... The lung cancer dataset is extracted from UK Biobank using the scripts provided in https://github.com/callta/synthetic-data-analyses/tree/main/code..."

Dataset Splits: Yes
LLM Response: "All models are implemented in PyTorch. The data is split 60-20-20 into an unlabeled training set, labeled training set, and test set respectively, and all reported results are averaged over 10 runs, where different data splits are sampled for each run." (This protocol is illustrated in the first sketch below, after the Experiment Setup entry.)

Hardware Specification: Yes
LLM Response: "All experiments are run on an NVIDIA Tesla K40C GPU."

Software Dependencies: No
LLM Response: The paper states "All models are implemented in PyTorch" but does not specify the version of PyTorch or any other software dependencies.

Experiment Setup: Yes
LLM Response: "For all experiments, we use a batch size of 64, but tune the learning rate η ∈ {0.001, 0.01, 0.1} and weight decay ∈ {0.001, 0.01, 0.1}. ... Additionally, we employ early stopping to terminate model training after 20 epochs of no improvement on the validation set, after which the best model is returned for evaluation." (This configuration is illustrated in the second sketch below.)

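The split protocol quoted in the Dataset Splits entry can be made concrete. Below is a minimal sketch, assuming NumPy index-based partitioning and per-run seeding; the dataset size and the helper name `split_60_20_20` are illustrative, not taken from the authors' repository.

```python
# Minimal sketch of the quoted 60-20-20 protocol: each of the 10 runs
# samples a fresh partition into unlabeled training, labeled training,
# and test sets. Dataset size and seeding are assumptions.
import numpy as np

def split_60_20_20(n_samples: int, seed: int):
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n_samples)
    n_unlab, n_lab = int(0.6 * n_samples), int(0.2 * n_samples)
    return (perm[:n_unlab],                   # unlabeled training set
            perm[n_unlab:n_unlab + n_lab],    # labeled training set
            perm[n_unlab + n_lab:])           # test set

for run in range(10):
    unlabeled_idx, labeled_idx, test_idx = split_60_20_20(10_000, seed=run)
    # ... fit on the two training sets, score on test_idx, and average
    # the 10 per-run scores to obtain the reported result ...
```
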
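The Experiment Setup entry likewise determines a complete tuning loop: batch size 64, a 3x3 grid over learning rate and weight decay, and early stopping with a patience of 20 epochs. The sketch below assumes a toy regression model, synthetic data, and the Adam optimizer; none of these are specified in the quoted text, and MAX_EPOCHS is an arbitrary cap.

```python
# Minimal sketch of the quoted configuration: batch size 64, grid search
# over learning rate / weight decay, early stopping with patience 20.
import itertools
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)
X, y = torch.randn(1000, 16), torch.randn(1000, 1)   # placeholder data
train_dl = DataLoader(TensorDataset(X[:800], y[:800]),
                      batch_size=64, shuffle=True)
X_val, y_val = X[800:], y[800:]                       # held-out validation set

PATIENCE, MAX_EPOCHS = 20, 200                        # MAX_EPOCHS is assumed

for lr, wd in itertools.product([0.001, 0.01, 0.1], [0.001, 0.01, 0.1]):
    model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 1))
    opt = torch.optim.Adam(model.parameters(), lr=lr, weight_decay=wd)
    best_val, stale, best_state = float("inf"), 0, None
    for epoch in range(MAX_EPOCHS):
        model.train()
        for xb, yb in train_dl:
            opt.zero_grad()
            nn.functional.mse_loss(model(xb), yb).backward()
            opt.step()
        model.eval()
        with torch.no_grad():
            val = nn.functional.mse_loss(model(X_val), y_val).item()
        if val < best_val:                # track the best validation loss
            best_val, stale = val, 0
            best_state = {k: v.clone() for k, v in model.state_dict().items()}
        else:
            stale += 1
            if stale >= PATIENCE:         # 20 epochs without improvement
                break
    model.load_state_dict(best_state)     # return the best model for evaluation
```

Snapshotting the best state dict and restoring it after early stopping mirrors the quoted behaviour of returning the best model for evaluation.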