Universalizing Weak Supervision

Authors: Changho Shin, Winfred Li, Harit Vishwakarma, Nicholas Carl Roberts, Frederic Sala

ICLR 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimentally, we validate our framework and show improvements over baselines in diverse settings, including real-world learning-to-rank and regression problems along with learning on hyperbolic manifolds. Experimentally, we demonstrate our approach on five choices of problems never before tackled in WS. Learning rankings: on two real-world ranking tasks, our approach with as few as five sources performs better than supervised learning with a smaller number of true labels; in contrast, an adaptation of the Snorkel (Ratner et al., 2018) framework cannot reach this performance with as many as 18 sources. Regression: on two real-world regression datasets, when using 6 or more labeling functions, the performance of our approach is comparable to fully-supervised models. Learning in hyperbolic spaces: on a geodesic regression task in hyperbolic space, we consistently outperform fully-supervised learning, even when using only 3 labeling functions (LFs). Estimation in generic metric spaces: in a synthetic setting of metric spaces induced by random graphs, we demonstrate that our method handles LF heterogeneity better than the majority-vote baseline. Learning parse trees: in semantic dependency parsing, we outperform strong baseline models.
Researcher Affiliation | Academia | Changho Shin, Winfred Li, Harit Vishwakarma, Nicholas Roberts, Frederic Sala, Department of Computer Sciences, University of Wisconsin-Madison {cshin23, wli525, hvishwakarma, ncroberts2, fsala}@wisc.edu
Pseudocode | Yes | Algorithm 1: Universal Label Model Learning; Algorithm 2: CONTINUOUSTRIPLETS; Algorithm 3: Isotropic Gaussian Label Model Learning; Algorithm 4: QUADRATICTRIPLETS
Open Source Code | No | The paper does not contain an explicit statement about the release of source code for the described methodology, nor does it provide a link to a code repository.
Open Datasets | Yes | For our movies dataset, we combined IMDb, TMDb, Rotten Tomatoes, and MovieLens movie review data to obtain features and weak labels. We used real-world datasets compatible with multiple label types, including a movies dataset and the Board Game Geek (BGG) dataset (2017), along with synthetic data. We used datasets on Czech and English taken from the Universal Dependencies (Nivre et al., 2020) repository. MSLR-WEB10K: https://www.microsoft.com/en-us/research/project/mslr/. IMDb movie dataset: https://www.imdb.com/interfaces/. TMDb 5K movie dataset version 2: https://www.kaggle.com/tmdb/tmdb-movie-metadata. Board Game Geek Reviews version 2: https://www.kaggle.com/jvanelteren/boardgamegeek-reviews, 2017.
Dataset Splits | No | The paper specifies a 'training set' and 'test set' split (e.g., '75% for training set, and 25% for the test set' or '5000 sets of movies as the training set, and 1000 sets of movies as the test set'), but does not explicitly mention a separate validation set or its proportion/size.
Hardware Specification | Yes | All experiments were conducted on a machine with an Intel Broadwell 2.7GHz CPU and an NVIDIA GK210 GPU.
Software Dependencies | No | The paper mentions software components like the 'SGD optimizer', 'ListMLE loss', and 'gradient boosting regression implemented in sklearn' but does not provide specific version numbers for these software libraries or frameworks.
Experiment Setup | Yes | In the ranking setup, we used a 4-layer MLP with ReLU activations. Each hidden layer had 30 units, and batch normalization (Ioffe & Szegedy, 2015) was applied to all hidden layers. We used the SGD optimizer with the ListMLE loss (Xia et al., 2008); the learning rate was 0.01. In the regression experiments, we used gradient boosting regression as implemented in sklearn with n_estimators=250. Other than n_estimators, we used the default hyperparameters of sklearn's implementation.
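The reported regression setup (a 75/25 train/test split and sklearn gradient boosting with n_estimators=250, defaults otherwise) can be sketched as below. This is a reconstruction from the quoted details, not the authors' code; the synthetic features and targets are placeholders standing in for the paper's datasets.

```python
# Sketch of the regression experiment setup described above (assumed
# reconstruction, not the authors' released code).
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 10))            # placeholder features
y = X[:, 0] + 0.1 * rng.normal(size=400)  # placeholder targets

# 75% training / 25% test, matching the split quoted under Dataset Splits
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=0.75, random_state=0
)

# n_estimators=250 as reported; all other hyperparameters left at defaults
model = GradientBoostingRegressor(n_estimators=250)
model.fit(X_train, y_train)
preds = model.predict(X_test)  # one prediction per test example
```

Note that the paper gives no random seed or sklearn version, so exact numbers will not be reproducible from this sketch alone.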