Target-Embedding Autoencoders for Supervised Representation Learning
Authors: Daniel Jarrett, Mihaela van der Schaar
ICLR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | As our empirical contribution, we extend validation of this approach beyond existing static classification applications to multivariate sequence forecasting, verifying its advantage on both linear and nonlinear recurrent architectures, thereby underscoring the further generality of this framework beyond feedforward instantiations. |
| Researcher Affiliation | Academia | Daniel Jarrett, Department of Mathematics, University of Cambridge, UK (daniel.jarrett@maths.cam.ac.uk); Mihaela van der Schaar, University of Cambridge, UK, and University of California, Los Angeles, USA (mv472@cam.ac.uk, mihaela@ee.ucla.edu) |
| Pseudocode | Yes | Figure 2 gives block diagrams of component functions and objectives in (a) FEAs and (b) TEAs during training (see Algorithm 1 in Appendix C for pseudocode). |
| Open Source Code | No | The paper does not provide an explicit statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | We use three datasets in our experiments. The first consists of a cohort of patients enrolled in the UK Cystic Fibrosis registry (UKCF)... The second consists of patients in the Alzheimer's Disease Neuroimaging Initiative study (ADNI)... The third consists of a cohort of patients in intensive care units from the Medical Information Mart for Intensive Care (MIMIC)... We thank the UK Cystic Fibrosis Trust, the Alzheimer's Disease Neuroimaging Initiative, and the MIT Lab for Computational Physiology respectively for making the UKCF, ADNI, and MIMIC datasets available for research. |
| Dataset Splits | Yes | For each model and dataset, we report the average and standard error of each performance metric across 10 different experiment runs, each with a different random train-test split. For hyperparameter tuning (ζ, ψ, ν, Ns), we use cross-validation on the training set using 20 iterations of random search, selecting the setting that gives the lowest validation loss averaged across folds. |
| Hardware Specification | No | The paper states 'All models are implemented in Tensorflow' but does not specify any particular hardware used for running the experiments (e.g., GPU/CPU models, memory specifications). |
| Software Dependencies | No | The paper states 'All models are implemented in Tensorflow' but does not provide specific version numbers for TensorFlow or any other software dependencies. |
| Experiment Setup | Yes | Training is performed using the ADAM optimizer with a learning rate of ψ ∈ {3e-5, 3e-4, 3e-3, 3e-2}. Models are trained until convergence up to a maximum of 10,000 iterations with a minibatch size of Ns ∈ {32, 64, 128}; the empirical loss is computed on the validation set every 50 iterations of training, and convergence is determined on the basis of that error. Checkpointing is implemented every 50 iterations, and the best model parameters are restored (upon convergence) for use on the testing set. For all models except 'Base', we allow the opportunity to select among the ℓ2-regularization coefficients ν ∈ {0, 3e-5, 3e-4, 3e-3, 3e-2}. We set the strength-of-prior coefficient λ = 0.5 for FEA, F/TEA, as well as all variants of TEA (however, we do provide sensitivities on λ for TEA in our experiments). |
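The training protocol quoted above (evaluate the validation loss every 50 iterations, checkpoint the best parameters, stop at convergence or 10,000 iterations, then restore the best checkpoint) can be sketched as follows. This is a minimal illustration on a toy one-dimensional objective, not the paper's TensorFlow implementation; the `PATIENCE` threshold is an assumption, since the paper only says convergence is "determined on the basis of" the validation error.

```python
# Hedged sketch of the described training loop: periodic validation,
# best-checkpoint tracking, and restore-on-convergence.

MAX_ITERS = 10_000   # maximum iterations, as stated in the paper
EVAL_EVERY = 50      # validation / checkpoint interval, as stated
LEARNING_RATE = 3e-3 # one candidate from {3e-5, 3e-4, 3e-3, 3e-2}
PATIENCE = 5         # hypothetical early-stopping patience (not in the paper)

def val_loss(w, target=2.0):
    """Stand-in validation loss: squared distance to a fixed target."""
    return (w - target) ** 2

def train(w0=0.0):
    w = w0
    best_w, best_loss = w, val_loss(w)
    evals_since_best = 0
    for it in range(1, MAX_ITERS + 1):
        # Toy parameter update (plain gradient descent, standing in for ADAM).
        grad = 2.0 * (w - 2.0)
        w -= LEARNING_RATE * grad
        if it % EVAL_EVERY == 0:
            loss = val_loss(w)
            if loss < best_loss:
                best_w, best_loss = w, loss  # checkpoint the best parameters
                evals_since_best = 0
            else:
                evals_since_best += 1
            if evals_since_best >= PATIENCE:
                break  # converged: stop training
    return best_w, best_loss  # restore best checkpoint for the test set

w, loss = train()
```

The same structure applies regardless of the model: only the update step and the validation loss change between the linear and nonlinear sequence-to-sequence architectures evaluated in the paper.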