SemiReward: A General Reward Model for Semi-supervised Learning

Authors: Siyuan Li, Weiyang Jin, Zedong Wang, Fang Wu, Zicheng Liu, Cheng Tan, Stan Z. Li

ICLR 2024

Reproducibility Variable Result LLM Response
Research Type Experimental Extensive experiments verify that SemiReward achieves significant performance gains and faster convergence speeds on top of Pseudo Label, FlexMatch, and FreeMatch/SoftMatch.
Researcher Affiliation Collaboration Siyuan Li1,2, Weiyang Jin2, Zedong Wang2, Fang Wu2, Zicheng Liu1,2, Cheng Tan1,2, Stan Z. Li2; 1Zhejiang University, College of Computer Science and Technology; 2Westlake University, AI Lab, Research Center for Industries of the Future, Hangzhou, China
Pseudocode Yes Algorithm 1: Pseudocode of SemiReward training and inference in a PyTorch-like style.
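To make the reported pseudocode concrete, below is a minimal, hypothetical sketch of the core idea SemiReward describes: a reward model scores each pseudo-label and only high-scoring samples are kept for training. The function and threshold names here are illustrative assumptions, not the paper's Algorithm 1, which additionally covers training the rewarder itself.

```python
# Illustrative sketch of reward-based pseudo-label filtering.
# NOTE: `rewarder` and `threshold` are hypothetical stand-ins; the
# paper's Algorithm 1 also specifies how the reward model is trained.

def filter_pseudo_labels(samples, rewarder, threshold=0.9):
    """Keep pseudo-labeled samples whose reward score exceeds `threshold`.

    samples  : iterable of (features, pseudo_label) pairs
    rewarder : callable mapping (features, pseudo_label) -> score in [0, 1]
    """
    kept = []
    for features, pseudo_label in samples:
        score = rewarder(features, pseudo_label)
        if score > threshold:
            kept.append((features, pseudo_label))
    return kept
```

In the paper's setting the retained high-quality pseudo-labels would then be fed to the standard SSL loss of the base method (e.g., FlexMatch), while low-scoring ones are discarded for that step.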
Open Source Code Yes Code and models are available at https://github.com/Westlake-AI/SemiReward.
Open Datasets Yes For CV tasks, our investigations featured the deployment of renowned and challenging datasets, including CIFAR-100 (Krizhevsky et al., 2009), STL-10 (Coates et al., 2011), EuroSAT (Helber et al., 2019), and ImageNet (Deng et al., 2009)... (further details in Tables A1 and A2 with citations)
Dataset Splits Yes Table A1: Settings and details of classification datasets in various modalities. Columns: Domain | Dataset | #Label per class | #Training data | #Validation data | #Test data | #Class
Hardware Specification Yes All experiments are implemented with PyTorch and run on NVIDIA A100 GPUs, using 4-GPU training by default.
Software Dependencies No All experiments are implemented with PyTorch and run on NVIDIA A100 GPUs, using 4-GPU training by default. No specific version number for PyTorch or other libraries is provided.
Experiment Setup Yes Table A3: Hyper-parameters and training schemes of SSL classification tasks based on USB. Table A4: Hyper-parameters and training schemes of SemiReward for various tasks and modalities. These tables specify learning rates, batch sizes, optimizers, schedulers, etc.