TransBoost: Improving the Best ImageNet Performance using Deep Transduction
Authors: Omer Belhasin, Guy Bar-Shalom, Ran El-Yaniv
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our method significantly improves the ImageNet classification performance on a wide range of architectures, such as ResNets, MobileNetV3-L, EfficientNet-B0, ViT-S, and ConvNeXt-T, leading to state-of-the-art transductive performance. Additionally, we show that TransBoost is effective on a wide variety of image classification datasets. |
| Researcher Affiliation | Collaboration | Omer Belhasin, Department of Computer Science, Technion - Israel Institute of Technology, omer.be@cs.technion.ac.il; Guy Bar-Shalom, Department of Computer Science, Technion - Israel Institute of Technology, guy.b@cs.technion.ac.il; Ran El-Yaniv, Department of Computer Science, Technion - Israel Institute of Technology and Deci.AI, rani@cs.technion.ac.il |
| Pseudocode | Yes | Algorithm 1: TransBoost Procedure |
| Open Source Code | Yes | The implementation of TransBoost is provided at: https://github.com/omerb01/TransBoost. |
| Open Datasets | Yes | Most of our study of TransBoost is done using the well-known ImageNet-1k ILSVRC-2012 dataset [19], which contains 1,281,167 training instances and 50,000 test instances in 1,000 categories. Additionally, we investigated the effectiveness of TransBoost on Food-101 [21], CIFAR-10 and CIFAR-100 [22], SUN-397 [23], Stanford Cars [24], FGVC Aircraft [25], the Describable Textures Dataset (DTD) [26], and Oxford 102 Flowers [27]. |
| Dataset Splits | Yes | Our hyperparameters were fine-tuned based on a validation set that was sampled from the ImageNet training set (a hedged split sketch follows the table). |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory, or specific computing infrastructure) used for running experiments are provided in the paper. |
| Software Dependencies | No | The paper mentions software like PyTorch and the Timm repository but does not provide specific version numbers for these or other software dependencies. |
| Experiment Setup | Yes | We used the SGD optimizer (Nesterov) with a momentum of 0.9, weight decay of 10⁻⁴, and a batch size of 1024 (consisting of 512 labeled training instances and 512 unlabeled test instances). Unless otherwise specified, the learning rate was fixed to 10⁻³ with no warmup for 120 epochs. The regularization hyperparameter of our loss in all our experiments was fixed to λ = 2; see Equation (7). A hedged configuration sketch follows the table. |
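
For the Dataset Splits row, a minimal sketch of carving a validation subset out of the ImageNet training set with torchvision is shown below. The 2% fraction, the random seed, the preprocessing, and the dataset path are illustrative assumptions; the summary does not state the exact split the authors used.

```python
import torch
from torch.utils.data import random_split
from torchvision import datasets, transforms

# Standard ImageNet preprocessing; the exact transforms are an assumption here.
transform = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
])

# ILSVRC-2012 training split (1,281,167 images); the root path is a placeholder.
train_full = datasets.ImageNet(root="/path/to/imagenet", split="train",
                               transform=transform)

# Hold out a small subset of the training set for hyperparameter tuning.
# The 2% fraction and the seed are illustrative choices, not values
# reported by the authors.
val_size = int(0.02 * len(train_full))
train_set, val_set = random_split(
    train_full,
    [len(train_full) - val_size, val_size],
    generator=torch.Generator().manual_seed(0),
)
```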
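
The Experiment Setup row maps directly onto a PyTorch/timm configuration. The sketch below encodes only what the paper states (optimizer, momentum, weight decay, learning rate, epoch budget, batch composition, and λ = 2); the choice of backbone is an assumption, and the TransBoost objective itself (Equation 7) is not reproduced here.

```python
import torch
import timm  # PyTorch image-model collection referenced by the paper

# Backbone choice is illustrative; the paper evaluates ResNets, MobileNetV3-L,
# EfficientNet-B0, ViT-S, and ConvNeXt-T, all available through timm.
model = timm.create_model("resnet50", pretrained=True)

# Optimizer settings as reported: SGD with Nesterov momentum 0.9,
# weight decay 1e-4, and a fixed learning rate of 1e-3 with no warmup.
optimizer = torch.optim.SGD(
    model.parameters(),
    lr=1e-3,
    momentum=0.9,
    weight_decay=1e-4,
    nesterov=True,
)

EPOCHS = 120               # fixed training budget reported in the paper
LABELED_PER_BATCH = 512    # labeled training instances per batch
UNLABELED_PER_BATCH = 512  # unlabeled test instances per batch (total 1024)
LAMBDA = 2.0               # regularization weight of the loss, Equation (7)
```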