Doubly-Robust Self-Training

Authors: Banghua Zhu, Mingyu Ding, Philip Jacobson, Ming Wu, Wei Zhan, Michael I. Jordan, Jiantao Jiao

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Through empirical evaluations on both the ImageNet dataset for image classification and the nuScenes autonomous driving dataset for 3D object detection, we demonstrate the superiority of the doubly robust loss over the standard self-training baseline.
Researcher Affiliation | Academia | All seven authors are listed with the Department of EECS, UC Berkeley: Banghua Zhu (banghua@berkeley.edu), Mingyu Ding (myding@berkeley.edu), Philip Jacobson (philip_jacobson@berkeley.edu), Ming Wu (mingwu@berkeley.edu), Wei Zhan (wzhan@berkeley.edu), Michael I. Jordan (jordan@berkeley.edu), and Jiantao Jiao (jiantao@berkeley.edu).
Pseudocode | No | The paper provides mathematical formulations of its loss functions but does not include structured pseudocode or algorithm blocks (a hedged sketch of the doubly-robust loss is given after this table).
Open Source Code | Yes | The code is available at https://github.com/dingmyu/Doubly-Robust-Self-Training.
Open Datasets | Yes | We evaluate our doubly robust self-training method on the ImageNet100 dataset, which contains a random subset of 100 classes from ImageNet-1k (Russakovsky et al., 2015)... We conduct experiments on both the image classification task with the ImageNet dataset... and the 3D object detection task with the autonomous driving dataset nuScenes (Caesar et al., 2020).
Dataset Splits | Yes | We first conduct experiments on ImageNet100 by training the model for 20 epochs using different fractions of labeled data from 1% to 100%. ... We report results for training with 1/24, 1/16, and 1/4 of the total labels in Table 2.
Hardware Specification | Yes | We train all the models with a batch size of 1024 on 8 Tesla V100 GPUs... The teacher pre-training and student training are both conducted for 10 epochs on 3 NVIDIA RTX A6000 GPUs.
Software Dependencies | No | The paper mentions the AdamW optimizer and a triangular learning rate schedule but does not specify software packages with version numbers for reproducibility (e.g., PyTorch 1.9, TensorFlow 2.x).
Experiment Setup | Yes | We train all the models with a batch size of 1024... All models are trained for 20 epochs... The weight decay is set to 0.05 and the maximal gradient norm is clipped to 1.0. The stochastic depth drop rates are set to 0.1 for all models. (A hedged sketch of how these settings might be wired together is shown after the table.)
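
Since the paper gives its method as loss equations rather than pseudocode, the following is a minimal PyTorch-style sketch of a doubly-robust self-training objective of the general form the paper describes: the pseudo-label loss over all samples, with a correction on the labeled subset that subtracts the pseudo-label loss and adds back the true-label loss. This is a sketch under that reading, not the authors' implementation; the function and argument names are illustrative and do not come from the released code.

```python
import torch
import torch.nn.functional as F

def doubly_robust_loss(logits_unlabeled, pseudo_unlabeled,
                       logits_labeled, pseudo_labeled, targets_labeled):
    """Sketch of a doubly-robust self-training objective (illustrative names).

    Combines (i) the pseudo-label loss over all samples with (ii) a correction
    on the labeled subset, where the pseudo-label loss is removed and the
    true-label loss is added back. See the paper's equations and the released
    code for the exact form and weighting.
    """
    n_unlabeled = logits_unlabeled.shape[0]
    n_labeled = logits_labeled.shape[0]
    n_total = n_unlabeled + n_labeled

    # Pseudo-label loss over all samples (unlabeled + labeled), averaged over n_total.
    loss_pseudo_all = (
        F.cross_entropy(logits_unlabeled, pseudo_unlabeled, reduction="sum")
        + F.cross_entropy(logits_labeled, pseudo_labeled, reduction="sum")
    ) / n_total

    # Correction on the labeled subset: drop its pseudo-label loss,
    # add the loss against the true labels.
    loss_pseudo_labeled = F.cross_entropy(logits_labeled, pseudo_labeled)
    loss_true_labeled = F.cross_entropy(logits_labeled, targets_labeled)

    return loss_pseudo_all - loss_pseudo_labeled + loss_true_labeled
```

Intuitively, when the pseudo-labels on the labeled subset match the true labels, the two correction terms cancel and the objective reduces to training on all data; when the pseudo-labels are poor, the correction pulls the objective back toward the labeled-only loss, which is the robustness property the paper evaluates.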
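
For the training configuration quoted in the table (AdamW, weight decay 0.05, gradient-norm clipping at 1.0, a triangular learning rate schedule, batch size 1024, 20 epochs), the sketch below shows one way these settings could be assembled in PyTorch. The peak learning rate, model, and step counts are placeholders, since they are not given above; OneCycleLR with linear annealing is used only as an approximation of a triangular schedule.

```python
import torch
from torch.optim import AdamW
from torch.optim.lr_scheduler import OneCycleLR

# Placeholder model and step counts; weight decay, epochs, and the clipping
# threshold follow the values reported in the table above.
model = torch.nn.Linear(2048, 100)     # stand-in for the actual backbone
epochs, steps_per_epoch = 20, 126      # illustrative: ~130k images / 1024 batch size
peak_lr = 1e-3                         # assumed; not reported in the table

optimizer = AdamW(model.parameters(), lr=peak_lr, weight_decay=0.05)
# Warmup to the peak then linear decay, approximating a triangular schedule.
scheduler = OneCycleLR(optimizer, max_lr=peak_lr, epochs=epochs,
                       steps_per_epoch=steps_per_epoch, anneal_strategy="linear")

for _ in range(epochs * steps_per_epoch):
    optimizer.zero_grad()
    loss = model(torch.randn(4, 2048)).mean()   # dummy forward pass
    loss.backward()
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
    optimizer.step()
    scheduler.step()
```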