A Theoretical Analysis of Contrastive Unsupervised Representation Learning

Authors: Nikunj Saunshi, Orestis Plevrakis, Sanjeev Arora, Mikhail Khodak, Hrishikesh Khandeparkar

ICML 2019

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "We conduct controlled experiments in both the text and image domains to support the theory." Section 8 describes experimental verification and support for the framework (a generic sketch of the kind of contrastive loss the paper analyzes follows the table). |
| Researcher Affiliation | Academia | (1) Princeton University, Princeton, New Jersey, USA; (2) Institute for Advanced Study, Princeton, New Jersey, USA; (3) Carnegie Mellon University, Pittsburgh, Pennsylvania, USA. |
| Pseudocode | No | The paper contains no structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper makes no explicit statement about releasing source code and gives no link to a code repository for the described methodology. |
| Open Datasets | Yes | Two datasets are used in the controlled experiments: (1) "The CIFAR-100 dataset (Krizhevsky, 2009)..." and (2) a corpus the authors construct: "Lacking an appropriate NLP dataset with large number of classes, we create the Wiki-3029 dataset..." In addition, "we also use the unsupervised part of the IMDb review corpus (Maas et al., 2011)". |
| Dataset Splits | Yes | "The CIFAR-100 dataset (Krizhevsky, 2009) consisting of 32x32 images categorized into 100 classes with a 50000/10000 train/test split." For Wiki-3029, "the train/dev/test split is 70%/10%/20%" (a minimal split sketch follows the table). |
| Hardware Specification | No | The paper does not report the hardware (e.g., GPU/CPU models, memory) used to run its experiments. |
| Software Dependencies | No | The paper does not list software dependencies with version numbers (e.g., library or solver versions) needed to replicate the experiments. |
| Experiment Setup | No | The paper mentions using a "GRU architecture" but gives no experimental setup details such as hyperparameter values, optimizer settings, or training configurations in the main text (a hedged GRU encoder sketch follows the table). |
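
For orientation, the paper's framework concerns representations trained with a contrastive objective over (anchor, positive, negatives) tuples. The sketch below is a minimal, generic k-negative logistic contrastive loss of the kind analyzed in this setting; the function name, tensor shapes, and inner-product similarity are illustrative assumptions, not the authors' code.

```python
import torch

def contrastive_logistic_loss(anchor, positive, negatives):
    """Hedged sketch of a k-negative logistic contrastive loss.

    anchor:    (batch, dim)    representations f(x)
    positive:  (batch, dim)    representations f(x+)
    negatives: (batch, k, dim) representations f(x-_1), ..., f(x-_k)

    Per-example loss: log(1 + sum_i exp(f(x)^T f(x-_i) - f(x)^T f(x+))).
    """
    pos = (anchor * positive).sum(dim=-1, keepdim=True)      # (batch, 1)
    neg = torch.einsum("bd,bkd->bk", anchor, negatives)      # (batch, k)
    # log(1 + sum_i exp(neg_i - pos)) via logsumexp over [0, neg - pos]
    return torch.logsumexp(
        torch.cat([torch.zeros_like(pos), neg - pos], dim=1), dim=1
    ).mean()
```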
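
The 70%/10%/20% train/dev/test split reported for Wiki-3029 can be reproduced generically. This is a minimal sketch assuming an in-memory list of examples; the paper does not specify its splitting code, so the seed and helper name are assumptions.

```python
import random

def train_dev_test_split(examples, seed=0, fractions=(0.7, 0.1, 0.2)):
    """Shuffle and split examples into train/dev/test by the given fractions."""
    rng = random.Random(seed)
    idx = list(range(len(examples)))
    rng.shuffle(idx)
    n_train = int(fractions[0] * len(examples))
    n_dev = int(fractions[1] * len(examples))
    train = [examples[i] for i in idx[:n_train]]
    dev = [examples[i] for i in idx[n_train:n_train + n_dev]]
    test = [examples[i] for i in idx[n_train + n_dev:]]
    return train, dev, test
```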
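
Finally, the paper names a GRU architecture for the text experiments but leaves its configuration unspecified. The following is a minimal sketch of such a sentence encoder in PyTorch; the embedding size, hidden size, and use of the final hidden state as the representation are illustrative assumptions, not the authors' setup.

```python
import torch.nn as nn

class GRUEncoder(nn.Module):
    """Minimal GRU sentence encoder; all sizes are assumptions."""

    def __init__(self, vocab_size, embed_dim=300, hidden_dim=300):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.gru = nn.GRU(embed_dim, hidden_dim, batch_first=True)

    def forward(self, token_ids):
        # token_ids: (batch, seq_len) integer tensor
        emb = self.embed(token_ids)   # (batch, seq_len, embed_dim)
        _, h_n = self.gru(emb)        # h_n: (1, batch, hidden_dim)
        return h_n.squeeze(0)         # (batch, hidden_dim) representation
```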