Understanding The Robustness of Self-supervised Learning Through Topic Modeling

Authors: Zeping Luo, Shiyou Wu, Cindy Weng, Mo Zhou, Rong Ge

ICLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Empirically, we show that the same objectives can perform on par with posterior inference using the correct model, while outperforming posterior inference using misspecified models. To verify our theory, we run synthetic experiments to show that self-supervised learning indeed outperforms inference with misspecified models."
Researcher Affiliation | Academia | "Duke University, USA {zeping.luo,shiyou.wu,cindy.weng}@duke.edu, {mozhou,rongge}@cs.duke.edu"
Pseudocode | No | The paper does not contain any clearly labeled "Pseudocode" or "Algorithm" blocks.
Open Source Code | No | The paper does not include an explicit statement about releasing its source code or a direct link to a code repository for the described methodology.
Open Datasets | Yes | "We use the AG news dataset (Zhang et al., 2015), the IMDB movie review sentiment classification dataset by Maas et al. (2011) and the Small DBpedia ontology dataset ... by Zhang et al. (2015)."
Dataset Splits | Yes | "Each category has 30,000 samples in the training set and 19,000 samples in the testing set. In the supervised phase, ... we train a linear classifier ... with 3-fold cross validation for parameter tuning (l2 regularization term and solver). To split the data set into an unsupervised dataset and a supervised dataset, we selected a random sample of 1000 documents as the labeled supervised dataset for each category of documents, while the remaining 116,000 documents fall into the unsupervised dataset for representation learning." (a hedged sketch of this supervised phase follows the table)
Hardware Specification | Yes | "We run our experiments on RTX 2080 Ti / V100 GPUs."
Software Dependencies | No | The paper mentions using the PyMC3 package, the Gensim library, and scikit-learn, but does not specify their version numbers, which are needed for reproducibility.
Experiment Setup | Yes | "In our simulation, we set K = 20, V = 5000, λ ∈ {30, 60}. During training, we resample 60,000 new documents after every 2 epochs. The network used for Figure 2 has 3 residual blocks and a width of 4096 neurons per layer. During training, we use the AMSGrad optimizer and an initial learning rate of 0.0001 or 0.0002." (hedged sketches of the simulated data and the training network follow the table)
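The Dataset Splits row describes the supervised phase: a linear classifier trained on 1,000 labeled documents per category, with 3-fold cross validation used to tune the l2 regularization term and the solver. Below is a minimal scikit-learn sketch of that phase under stated assumptions: the choice of logistic regression as the linear classifier, the grid values, and the placeholder features `X_labeled`/`y_labeled` are illustrative, not taken from the paper.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

# Placeholder features standing in for encoder outputs on the 1,000 labeled
# documents per category; random data only so the sketch runs end to end.
rng = np.random.default_rng(0)
X_labeled = rng.normal(size=(4_000, 256))      # e.g. 4 categories x 1,000 docs
y_labeled = rng.integers(0, 4, size=4_000)

# 3-fold cross validation over the l2 regularization strength and the solver,
# mirroring the tuning described in the excerpt (grid values are assumed).
param_grid = {
    "C": [0.01, 0.1, 1.0, 10.0],               # inverse l2 regularization strength
    "solver": ["lbfgs", "liblinear"],
}
search = GridSearchCV(LogisticRegression(penalty="l2", max_iter=1000), param_grid, cv=3)
search.fit(X_labeled, y_labeled)
print(search.best_params_, search.best_score_)
```

Only the classifier is tuned and trained here; the representation itself comes from the self-supervised phase on the remaining unlabeled documents and is kept fixed.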
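The Experiment Setup row lists the simulation parameters K = 20 topics, V = 5000 vocabulary words, and λ ∈ {30, 60}. A small NumPy sketch of an LDA-style generator with those sizes is given below; treating λ as the expected (Poisson) document length and using symmetric Dirichlet priors are assumptions made here for illustration, as is the helper name `generate_corpus`.

```python
import numpy as np

def generate_corpus(n_docs, K=20, V=5000, lam=30, alpha=0.1, beta=0.01, seed=0):
    """Sample bag-of-words documents from an LDA-style topic model.

    Assumptions (not stated in the excerpt): document length ~ Poisson(lam),
    topic mixtures ~ Dirichlet(alpha), topic-word distributions ~ Dirichlet(beta).
    """
    rng = np.random.default_rng(seed)
    topics = rng.dirichlet(np.full(V, beta), size=K)   # K x V word distributions
    docs = np.zeros((n_docs, V), dtype=np.int32)       # bag-of-words count matrix
    for d in range(n_docs):
        theta = rng.dirichlet(np.full(K, alpha))       # per-document topic mixture
        length = rng.poisson(lam)                      # assumed document length
        z = rng.choice(K, size=length, p=theta)        # topic assignment per word
        for k, count in zip(*np.unique(z, return_counts=True)):
            words = rng.choice(V, size=count, p=topics[k])
            np.add.at(docs[d], words, 1)
    return docs, topics

# The paper resamples 60,000 new documents every 2 epochs; a smaller corpus is
# drawn here only to keep the example cheap to run.
docs, topics = generate_corpus(n_docs=1_000, K=20, V=5000, lam=30)
```

The resulting count matrix can be fed to a bag-of-words encoder such as the one sketched next.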
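The same row specifies the network behind Figure 2: 3 residual blocks at 4096 neurons per layer, trained with the AMSGrad optimizer at an initial learning rate of 0.0001 or 0.0002. The PyTorch sketch below wires those numbers together; the input dimension (V = 5000 bag-of-words counts), the output dimension, and the exact block layout are assumptions for illustration, not details quoted from the paper.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """One residual block of the quoted width: x + MLP(x) with 4096-unit layers."""
    def __init__(self, width: int = 4096):
        super().__init__()
        self.body = nn.Sequential(
            nn.Linear(width, width),
            nn.ReLU(),
            nn.Linear(width, width),
        )

    def forward(self, x):
        return x + self.body(x)

class Encoder(nn.Module):
    """Input projection, 3 residual blocks, and an output head.

    The input dimension (V = 5000) and output dimension (K = 20) are assumptions
    matching the simulation sizes, not architecture details stated in the excerpt.
    """
    def __init__(self, in_dim=5000, width=4096, out_dim=20, n_blocks=3):
        super().__init__()
        self.proj = nn.Linear(in_dim, width)
        self.blocks = nn.Sequential(*[ResidualBlock(width) for _ in range(n_blocks)])
        self.head = nn.Linear(width, out_dim)

    def forward(self, x):
        return self.head(self.blocks(torch.relu(self.proj(x))))

model = Encoder()
# AMSGrad variant of Adam with one of the quoted initial learning rates.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4, amsgrad=True)
```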