Losses over Labels: Weakly Supervised Learning via Direct Loss Construction
Authors: Dylan Sam, J. Zico Kolter
AAAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We show that LoL improves upon existing weak supervision methods on several benchmark text and image classification tasks and further demonstrate that incorporating gradient information leads to better performance on almost every task. |
| Researcher Affiliation | Collaboration | Dylan Sam (1), J. Zico Kolter (1,2); (1) Machine Learning Department, Carnegie Mellon University; (2) Bosch Center for Artificial Intelligence; dylansam@andrew.cmu.edu, zkolter@cs.cmu.edu |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code for our experiments can be found here: https://github.com/dsam99/LoL |
| Open Datasets | Yes | We compare LoL to existing weakly supervised algorithms on 5 text classification datasets from WRENCH (Zhang et al. 2021)... We extend our setting to consider 3 image classification tasks from the Animals with Attributes 2 dataset (Xian et al. 2018). |
| Dataset Splits | Yes | For each task, we split the dataset into 80% train and validation data and 20% test data. Then we further split the training and validation data, holding out N labeled examples per class as a validation set. We report results for validation set sizes of N ∈ {10, 15, 20, 50, 100}. (A split sketch appears after the table.) |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU models, CPU types, memory) used for the experiments. It only mentions general concepts like neural networks. |
| Software Dependencies | No | The paper does not specify versions for any software dependencies (e.g., Python, PyTorch, TensorFlow, scikit-learn). It only implicitly refers to programming environments. |
| Experiment Setup | Yes | At a high level, this loss function incorporates a squared penalty for the gradient of our model being less than c times the gradient of the heuristic (along non-abstained dimensions). α serves as a hyperparameter that determines the weighting or importance of the gradient matching term, similar to a weighting parameter for regularization. ... and c > 0. ... In these experiments, we train methods for 10 epochs. (A loss sketch appears after the table.) |
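
The Dataset Splits row describes an 80%/20% train+validation/test split, followed by holding out N labeled examples per class as a validation set. Below is a minimal sketch of that procedure, assuming a scikit-learn/NumPy setup; the function name, variable names, and seeding are illustrative assumptions, not the authors' released code.

```python
import numpy as np
from sklearn.model_selection import train_test_split

def split_dataset(X, y, n_per_class=10, seed=0):
    """Split into 80% train+validation and 20% test, then hold out
    n_per_class labeled examples per class as the validation set."""
    X_trval, X_test, y_trval, y_test = train_test_split(
        X, y, test_size=0.2, random_state=seed, stratify=y)
    rng = np.random.default_rng(seed)
    # Sample n_per_class validation indices from each class.
    val_idx = np.concatenate([
        rng.choice(np.where(y_trval == c)[0], size=n_per_class, replace=False)
        for c in np.unique(y_trval)])
    train_idx = np.setdiff1d(np.arange(len(y_trval)), val_idx)
    return (X_trval[train_idx], y_trval[train_idx],   # training data
            X_trval[val_idx], y_trval[val_idx],       # small labeled validation set
            X_test, y_test)                           # held-out test set
```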
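
The Experiment Setup row describes a squared penalty applied when the model's input gradient falls below c times the heuristic's gradient along non-abstained dimensions, weighted by a hyperparameter α. The following is a hedged PyTorch sketch of such a term, not the paper's exact formulation; the function names, the use of summed logits as the scalar being differentiated, and the masking convention are assumptions.

```python
import torch
import torch.nn.functional as F

def gradient_matching_penalty(model, x, heuristic_grad, mask, c=1.0):
    """Squared penalty wherever the model's input gradient is smaller than
    c times the heuristic's input gradient, restricted to non-abstained
    dimensions (mask == 1). Assumes x, heuristic_grad, mask share shape (batch, d)."""
    x = x.detach().clone().requires_grad_(True)
    # Use the summed model outputs as a scalar to differentiate (a simplification).
    out = model(x).sum()
    model_grad = torch.autograd.grad(out, x, create_graph=True)[0]
    # Squared hinge: penalize only the shortfall below c * |heuristic gradient|.
    shortfall = F.relu(c * heuristic_grad.abs() - model_grad.abs())
    return (mask * shortfall ** 2).mean()

def lol_style_loss(base_loss, model, x, heuristic_grad, mask, alpha=0.1, c=1.0):
    # alpha weights the gradient-matching term, analogous to a regularization weight.
    return base_loss + alpha * gradient_matching_penalty(model, x, heuristic_grad, mask, c)
```

Keeping `create_graph=True` lets the penalty itself be backpropagated through during training, which is needed for the gradient-matching term to influence the model's parameters.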