Sample and Predict Your Latent: Modality-free Sequential Disentanglement via Contrastive Estimation

Authors: Ilan Naiman, Nimrod Berman, Omri Azencot

ICML 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate our approach on video, audio, and time series benchmarks. Our method presents state-of-the-art results in comparison to existing techniques.
Researcher Affiliation | Academia | Department of Computer Science, Ben Gurion University of the Negev, Beer-Sheva, Israel.
Pseudocode | Yes | Algorithm 1 Static predictive sampling trick
Open Source Code | Yes | The code is available at GitHub.
Open Datasets | Yes | Sprites. A dataset introduced by (Reed et al., 2015)... MUG. A Facial expression dataset created by (Aifanti et al., 2010)... TIMIT. A dataset introduced by (Garofolo et al., 1992)... Jester. A dataset introduced by (Materzynska et al., 2019)... Letters. The Letters dataset (Ibrahim et al., 2019)... Physionet. The Physionet ICU Dataset (Goldberger et al., 2000)... Air Quality. The UCI Beijing Multi-site Air Quality dataset (Zhang et al., 2017)...
Dataset Splits | No | The paper specifies training and testing splits for datasets, and describes splitting the test set for downstream tasks, but does not explicitly detail a separate validation set split used for tuning the main model during training.
Hardware Specification | No | The paper does not specify any hardware details like GPU models, CPU types, or memory used for running the experiments.
Software Dependencies | No | All the models have been implemented using Pytorch (Paszke et al., 2019). While PyTorch is mentioned, a specific version number for PyTorch or other software dependencies like Python or CUDA is not provided.
Experiment Setup | Yes | The hyperparameter λ1 is tuned over {1, 2.5, 5, 10}, λ2 is tuned over {1, 3, 5, 7, 9}, and λ4 and λ5 are tuned over {0.1, 0.5, 1, 2.5, 5} while λ3 is fixed to 1. We used Adam optimizer (Kingma & Ba, 2014) with the learning rate chosen from {0.001, 0.0015, 0.002}. The static and dynamic features dimensions are selected from {128, 256} and {32, 64}, respectively.
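
To make the size of the quoted search space concrete, the following is a minimal Python sketch that enumerates the values listed in the Experiment Setup row. The dictionary keys (`lambda1`, `lr`, `static_dim`, and so on) and the helper `iter_configs` are hypothetical names chosen for illustration and are not taken from the authors' code. The sketch also assumes that λ4 and λ5 are tuned independently over the same set, and it does not imply that the authors ran an exhaustive grid search.

```python
from itertools import product

# Candidate values as quoted in the Experiment Setup row.
# Key names are hypothetical; they are not the authors' identifiers.
SEARCH_SPACE = {
    "lambda1": [1, 2.5, 5, 10],
    "lambda2": [1, 3, 5, 7, 9],
    "lambda3": [1],  # reported as fixed to 1
    "lambda4": [0.1, 0.5, 1, 2.5, 5],  # assumed independent of lambda5
    "lambda5": [0.1, 0.5, 1, 2.5, 5],
    "lr": [0.001, 0.0015, 0.002],  # Adam learning rate candidates
    "static_dim": [128, 256],
    "dynamic_dim": [32, 64],
}


def iter_configs(space):
    """Yield one dict per combination in the Cartesian product of the space."""
    keys = list(space)
    for values in product(*(space[k] for k in keys)):
        yield dict(zip(keys, values))


configs = list(iter_configs(SEARCH_SPACE))
num_configs = len(configs)
```

Under these assumptions the full Cartesian product contains 4 × 5 × 1 × 5 × 5 × 3 × 2 × 2 = 6000 configurations. The summary does not state how many of them were actually evaluated or how the final values were chosen per dataset.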