Robust Learning with the Hilbert-Schmidt Independence Criterion

Authors: Daniel Greenfeld, Uri Shalit

ICML 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on unsupervised covariate shift tasks demonstrate that models learned with the proposed loss function outperform models learned with standard loss functions, achieving state-of-the-art results on a challenging cell-microscopy unsupervised covariate shift task. We provide experimental validation using both linear models and deep networks, showing that learning with the HSIC-loss is competitive on a variety of unsupervised covariate shift benchmarks.
Researcher Affiliation | Academia | Technion - Israel Institute of Technology, Haifa, Israel. Correspondence to: Daniel Greenfeld <danielgreenfeld3@gmail.com>, Uri Shalit <urishalit@technion.ac.il>.
Pseudocode | Yes | In Algorithm 1 we present a general gradient-based method for learning with this loss. Algorithm 1: Learning with HSIC-loss. (A hedged sketch of such a training loop follows the table.)
Open Source Code | Yes | We provide code, including a PyTorch (Paszke et al., 2019) class for the HSIC-loss: https://github.com/danielgreenfeld3/XIC. (A minimal HSIC-loss sketch follows the table.)
Open Datasets | Yes | We experiment with fitting a linear model... We used the bike sharing dataset by Fanaee-T and Gama (2014) from the UCI repository... In this experiment we test the performance of models trained on the MNIST dataset by LeCun et al. (1998)... In the last experiment, we test our approach on the cell out of sample dataset introduced by Lu et al. (2019).
Dataset Splits | Yes | Training was done for 50 epochs on 80% of the SOURCE dataset, and the final model was chosen according to the remaining 20% used as a validation set. We ran 100 experiments, each done by randomly sub-sampling 80% of the SOURCE set and 80% of the TARGET set, thus obtaining a standard error estimate of the mean.
Hardware Specification | No | The paper describes the models trained and the datasets used but does not specify any hardware details, such as the GPU or CPU models used for the experiments.
Software Dependencies | No | The paper mentions 'PyTorch (Paszke et al., 2019)' and 'automatic differentiation software (Paszke et al., 2019; Abadi et al., 2015)' for implementing the loss, but does not provide specific version numbers for these or other software dependencies.
Experiment Setup | Yes | When training the models with HSIC-loss, we used a batch size of 32, and optimized using the Adam optimizer (Kingma and Ba, 2014). The kernels we chose were radial basis function kernels, with γ = 1 for both covariates and residuals kernels. Training was done for 50 epochs on 80% of the SOURCE dataset... The optimization was done with Adam (Kingma and Ba, 2014), with a batch size of 128, and exponential decay of the learning rate was used when training with cross-entropy loss.
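The Open Source Code row points to the authors' PyTorch class for the HSIC-loss; the sketch below is not that implementation but a minimal illustration of the idea under stated assumptions: a biased empirical HSIC estimate between covariates and residuals, using RBF kernels with γ = 1 for both (as in the Experiment Setup row). The names `rbf_kernel` and `hsic_loss` are illustrative, not taken from the repository.

```python
import torch

def rbf_kernel(z, gamma=1.0):
    # Pairwise Gaussian kernel matrix: k(z_i, z_j) = exp(-gamma * ||z_i - z_j||^2).
    sq_dists = torch.cdist(z, z, p=2.0) ** 2
    return torch.exp(-gamma * sq_dists)

def hsic_loss(x, residuals, gamma_x=1.0, gamma_r=1.0):
    # Biased empirical HSIC estimate between covariates x (n x d) and residuals (n x 1):
    #   HSIC_hat = trace(K H L H) / (n - 1)^2,  with  H = I - (1/n) 1 1^T.
    # Minimizing it drives the residuals y - f(x) towards independence of x.
    n = x.shape[0]
    K = rbf_kernel(x, gamma_x)          # kernel on covariates
    L = rbf_kernel(residuals, gamma_r)  # kernel on residuals
    H = torch.eye(n, device=x.device) - torch.full((n, n), 1.0 / n, device=x.device)
    return torch.trace(K @ H @ L @ H) / (n - 1) ** 2
```

Note that HSIC is invariant to adding a constant to all residuals, so this loss alone does not pin down the predictor's intercept; the offset would need to be set separately, for example to match the mean residual on the training data.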
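The Pseudocode and Experiment Setup rows describe a general gradient-based procedure (Algorithm 1) and concrete hyperparameters (Adam, batch size 32, RBF kernels with γ = 1, 50 epochs). The loop below is a hedged sketch of how those pieces could fit together around the `hsic_loss` sketch above; the learning rate and the function name `train_with_hsic` are assumptions, not values reported in the table, and this is not claimed to reproduce Algorithm 1 exactly.

```python
from torch.utils.data import DataLoader, TensorDataset

def train_with_hsic(model, X, y, epochs=50, batch_size=32, lr=1e-3, gamma=1.0):
    # Illustrative gradient-based training: each step minimizes the HSIC between
    # a mini-batch of covariates and the corresponding residuals.
    # X: float tensor of shape (n, d); y: float tensor of shape (n,) or (n, 1).
    loader = DataLoader(TensorDataset(X, y), batch_size=batch_size, shuffle=True)
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)  # lr is an assumption, not reported
    model.train()
    for _ in range(epochs):
        for xb, yb in loader:
            preds = model(xb).reshape(len(xb), -1)
            residuals = yb.reshape(len(xb), -1) - preds  # keep residuals 2-D for the kernel
            loss = hsic_loss(xb, residuals, gamma_x=gamma, gamma_r=gamma)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return model
```

For instance, the linear-model experiments could be mimicked with `train_with_hsic(torch.nn.Linear(d, 1), X, y)`, followed by the intercept calibration mentioned above.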