Open-World Semi-Supervised Learning

Authors: Kaidi Cao, Maria Brbić, Jure Leskovec

ICLR 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on image classification datasets and a single-cell annotation dataset demonstrate that ORCA consistently outperforms alternative baselines, achieving 25% improvement on seen and 96% improvement on novel classes of the ImageNet dataset.
Researcher Affiliation | Academia | Kaidi Cao, Maria Brbić, Jure Leskovec, Department of Computer Science, Stanford University, {kaidicao, mbrbic, jure}@cs.stanford.edu
Pseudocode | Yes | Algorithm 1 ORCA: Open-woRld with unCertainty based Adaptive margin. Require: labeled subset D_l = {(x_i, y_i)}_{i=1}^{n}, unlabeled subset D_u = {x_i}_{i=1}^{m}, expected number of novel classes, a parameterized backbone f_θ, linear classifier with weight W. 1: Pretrain the model parameters θ with pretext loss. 2: for epoch = 1 to E do 3: u ← EstimateUncertainty(D_u) 4: for t = 1 to T do 5: X_l, X_u ← SampleMiniBatch(D_l ∪ D_u) 6: Z_l, Z_u ← Forward(X_l ∪ X_u; f_θ) 7: Z'_l, Z'_u ← FindClosest(Z_l ∪ Z_u) 8: Compute L_P using (5) 9: Compute L_S using (3) 10: Compute R using (6) 11: f_θ ← SGD with loss L_BCE + η_1 L_CE + η_2 L_R 12: end for 13: end for. (A PyTorch-style sketch of this loop follows the table.)
Open Source Code | Yes | The code of ORCA is publicly available at https://github.com/snap-stanford/orca.
Open Datasets | Yes | We evaluate ORCA on four different datasets, including three standard benchmark image classification datasets, CIFAR-10, CIFAR-100 (Krizhevsky, 2009) and ImageNet (Russakovsky et al., 2015), and a highly unbalanced single-cell Mouse Ageing Cell Atlas dataset from the biology domain (Consortium et al., 2020).
Dataset Splits | Yes | On all datasets, we use controllable ratios of unlabeled data and novel classes. We first divide classes into 50% seen and 50% novel classes. We then select 50% of the samples from the seen classes as the labeled set, and the rest as the unlabeled set. We show results with different ratios of seen and novel classes and with 10% labeled samples in Appendix C. (A split-construction sketch follows the table.)
Hardware Specification | Yes | Our core algorithm is developed using PyTorch (Paszke et al., 2019) and we conduct all the experiments with an NVIDIA RTX 2080 Ti.
Software Dependencies | No | The paper mentions "PyTorch (Paszke et al., 2019)", which cites the paper introducing PyTorch, but does not provide a specific version number for the PyTorch library or any other software dependency (e.g., "PyTorch 1.9").
Experiment Setup | Yes | We train the model using standard SGD with a momentum of 0.9 and a weight decay of 5×10^-4. The model is trained for 200 epochs with a batch size of 512. We anneal the learning rate by a factor of 10 at epochs 140 and 180. ... We set hyperparameters to the following default values: s = 10, λ = 1, η_1 = 1, η_2 = 1. ... We use the Adam optimizer with an initial learning rate of 10^-3 and a weight decay of 0. The model is trained with a batch size of 512 for 20 epochs. (These settings are restated as a PyTorch sketch below.)
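The pseudocode row above summarizes Algorithm 1. Below is a minimal PyTorch-style sketch of that training loop, assuming standard (images, labels) data loaders. The three loss terms are simplified stand-ins for equations (3), (5) and (6) of the paper, and the helper names (estimate_uncertainty, find_closest, train_orca) are ours, not the authors'; the authors' actual implementation is in the linked repository.

```python
# Minimal sketch of the ORCA training loop in Algorithm 1 (not the authors' code).
import torch
import torch.nn.functional as F


def estimate_uncertainty(model, head, unlabeled_loader):
    """Average (1 - max softmax confidence) over the unlabeled set;
    a simplified stand-in for the uncertainty u in Algorithm 1."""
    model.eval()
    conf = []
    with torch.no_grad():
        for xu, _ in unlabeled_loader:
            probs = F.softmax(head(model(xu)), dim=1)
            conf.append(probs.max(dim=1).values)
    model.train()
    return 1.0 - torch.cat(conf).mean()


def find_closest(z):
    """For each embedding, the index of its most similar other embedding
    in the batch (used to form pseudo-positive pairs)."""
    zn = F.normalize(z, dim=1)
    sim = zn @ zn.T
    sim.fill_diagonal_(float("-inf"))
    return sim.argmax(dim=1)


def train_orca(model, head, labeled_loader, unlabeled_loader, optimizer,
               epochs=200, s=10.0, lam=1.0, eta1=1.0, eta2=1.0):
    for epoch in range(epochs):                                          # step 2
        u = estimate_uncertainty(model, head, unlabeled_loader)          # step 3
        for (xl, yl), (xu, _) in zip(labeled_loader, unlabeled_loader):  # steps 4-5
            z = model(torch.cat([xl, xu]))                               # step 6: joint forward pass
            logits = head(z)
            probs = F.softmax(logits, dim=1)

            # Pairwise objective L_P (simplified eq. (5)): pull each example
            # towards its closest neighbour in embedding space.
            pair = find_closest(z)                                       # step 7
            loss_p = -torch.log((probs * probs[pair]).sum(dim=1) + 1e-8).mean()

            # Supervised objective L_S (simplified eq. (3)): cross-entropy on the
            # labeled half with an uncertainty-adaptive margin lam * u.
            margins = F.one_hot(yl, logits.size(1)).float() * (lam * u)
            loss_s = F.cross_entropy(s * (logits[: xl.size(0)] - margins), yl)

            # Regularization R (simplified eq. (6)): keep the mean prediction
            # close to uniform so that novel classes are not collapsed.
            mean_p = probs.mean(dim=0)
            loss_r = (mean_p * torch.log(mean_p * probs.size(1) + 1e-8)).sum()

            loss = loss_p + eta1 * loss_s + eta2 * loss_r                # step 11
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
```

The optimizer is passed in rather than built inside the loop so the quoted schedules (see the setup sketch further below) can be swapped in without touching the training code.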
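The dataset-splits row describes a 50/50 seen/novel class split with half of the seen-class samples labeled. A sketch of that protocol on CIFAR-100, loaded through torchvision (an assumption; the paper only names the datasets), might look like:

```python
# Sketch of the seen/novel split protocol; the authors' split code lives in their repository.
import numpy as np
import torchvision
from torch.utils.data import Subset

dataset = torchvision.datasets.CIFAR100(root="./data", train=True, download=True)
targets = np.array(dataset.targets)
num_classes = 100

# 50% of the classes are treated as seen, the remaining 50% as novel.
seen_classes = np.arange(num_classes // 2)

# 50% of the samples from seen classes form the labeled set; the other half,
# together with all samples from novel classes, form the unlabeled set.
rng = np.random.default_rng(0)
labeled_idx, unlabeled_idx = [], []
for c in range(num_classes):
    idx = np.where(targets == c)[0]
    if c in seen_classes:
        rng.shuffle(idx)
        half = len(idx) // 2
        labeled_idx.extend(idx[:half].tolist())
        unlabeled_idx.extend(idx[half:].tolist())
    else:
        unlabeled_idx.extend(idx.tolist())

labeled_set = Subset(dataset, labeled_idx)
unlabeled_set = Subset(dataset, unlabeled_idx)
```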
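Finally, the experiment-setup row can be restated as PyTorch optimizer objects. The initial SGD learning rate is not given in the excerpt, so the value below is a placeholder, and which configuration (the 200-epoch SGD schedule vs. the 20-epoch Adam run) applies to which dataset follows our reading of the quoted text.

```python
import torch

def make_sgd_setup(model, head):
    # Image benchmarks: 200 epochs, batch size 512, SGD with momentum 0.9 and
    # weight decay 5e-4; the learning rate is divided by 10 at epochs 140 and 180.
    # The initial learning rate is not stated in the excerpt; 0.1 is a placeholder.
    params = list(model.parameters()) + list(head.parameters())
    opt = torch.optim.SGD(params, lr=0.1, momentum=0.9, weight_decay=5e-4)
    sched = torch.optim.lr_scheduler.MultiStepLR(opt, milestones=[140, 180], gamma=0.1)
    return opt, sched

def make_adam_setup(model, head):
    # Shorter run: 20 epochs, batch size 512, Adam with initial lr 1e-3 and no weight decay.
    params = list(model.parameters()) + list(head.parameters())
    return torch.optim.Adam(params, lr=1e-3, weight_decay=0.0)

# Default loss hyperparameters quoted above: s = 10, lambda = 1, eta1 = 1, eta2 = 1.
DEFAULT_HPARAMS = dict(s=10.0, lam=1.0, eta1=1.0, eta2=1.0, batch_size=512)
```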