Hardware Resilience Properties of Text-Guided Image Classifiers

Authors: Syed Talal Wasim, Kabila Haile Soboka, Abdulrahman Mahmoud, Salman H. Khan, David Brooks, Gu-Yeon Wei

NeurIPS 2023

Reproducibility assessment: each entry below gives the Reproducibility Variable, the assessed Result, and the supporting LLM Response.

Research Type: Experimental
LLM Response: "Our experiments show that the combination of textual and visual information can improve the reliability of a neural network's classification layer by up to 14× compared to traditional error detection and correction techniques, with minimal changes to pre-existing training recipes and their corresponding training accuracy."

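As context for that claim, the method guides the classification layer with text-derived class embeddings. Below is a minimal sketch of one plausible rendering in PyTorch, assuming per-class CLIP text embeddings are used to initialize a linear classifier; the names (`TextGuidedHead`, `feat_dim`) are ours, not the authors' implementation.

```python
import torch
import torch.nn as nn

class TextGuidedHead(nn.Module):
    """Hypothetical classification head whose weights start from text embeddings."""

    def __init__(self, text_embeddings: torch.Tensor, feat_dim: int):
        # text_embeddings: (num_classes, embed_dim), e.g. CLIP encodings of
        # per-class descriptions (an assumption, not the paper's exact recipe).
        super().__init__()
        num_classes, embed_dim = text_embeddings.shape
        self.proj = nn.Linear(feat_dim, embed_dim)  # map backbone features into the text space
        self.classifier = nn.Linear(embed_dim, num_classes, bias=False)
        with torch.no_grad():
            # L2-normalize so each class weight is a unit-length text embedding
            normed = text_embeddings / text_embeddings.norm(dim=-1, keepdim=True)
            self.classifier.weight.copy_(normed)

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.proj(feats))
```
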
Researcher Affiliation: Academia
LLM Response: "Syed Talal Wasim (Mohamed bin Zayed University of AI); Kabila Haile Soboka (Independent); Abdulrahman Mahmoud (Harvard University); Salman Khan (Mohamed bin Zayed University of AI); David Brooks (Harvard University); Gu-Yeon Wei (Harvard University)"

Pseudocode: No
LLM Response: The paper describes the proposed architecture and methodology in text and diagrams (Figure 1), but does not provide any pseudocode or algorithm blocks.

Open Source Code: Yes
LLM Response: "Our code and models are released at https://github.com/TalalWasim/TextGuidedResilience."

Open Datasets: Yes
LLM Response: "For each backbone, we train a baseline model (the default architecture), and our modified classification model on the ImageNet training set. For both methods, we follow the standard PyTorch [47] training recipe. We then report accuracy and resilience on the ImageNet validation set and compare across multiple architecture families. We evaluate our method on additional datasets (CIFAR10 [29], CIFAR100 [30], Food101 [3], and STL10 [9]) for two networks: ResNet-50 [20] and FocalNet-T [65]."

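All four additional datasets ship with torchvision, so the evaluation data can be pulled directly; a minimal sketch, assuming default torchvision APIs and placeholder root paths and transforms (ImageNet itself requires a manual download):

```python
from torchvision import datasets, transforms

# Placeholder preprocessing; the paper's exact transforms follow its training recipe.
tf = transforms.Compose([
    transforms.Resize(224),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
])

cifar10  = datasets.CIFAR10(root="data", train=False, transform=tf, download=True)
cifar100 = datasets.CIFAR100(root="data", train=False, transform=tf, download=True)
food101  = datasets.Food101(root="data", split="test", transform=tf, download=True)
stl10    = datasets.STL10(root="data", split="test", transform=tf, download=True)
```
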
Dataset Splits: Yes
LLM Response: "We then report accuracy and resilience on the ImageNet validation set and compare across multiple architecture families. The tables present results across various datasets (CIFAR10 [29], CIFAR100 [30], Food101 [3], and STL10 [9]) for two backbones (ResNet-50 [20] and FocalNet-T [65]), reporting top-1 accuracy on the respective validation set for both the baseline and our method."

Hardware Specification: Yes
LLM Response: "We train our models on 4 A100 GPUs and run the training routine for both the original and our modified models to the same number of epochs and with the same hyperparameters as described in the PyTorch GitHub repository for a fair comparison. For the ablation study presented in Table 3, we train each model on 8 V100 GPUs."

Software Dependencies: No
LLM Response: The paper mentions using 'PyTorch [47]' and models like GPT-3 and CLIP, but it does not specify version numbers for PyTorch or other software dependencies required for reproduction.

Experiment Setup: Yes
LLM Response: "We train our models on 4 A100 GPUs and run the training routine for both the original and our modified models to the same number of epochs and with the same hyperparameters as described in the PyTorch GitHub repository for a fair comparison. Our results are presented in Section 6, in Table 1. For the ablation study presented in Table 3, we train each model on 8 V100 GPUs. We used the same set of hyperparameters for both models (detailed hyperparameters are reported in Appendix A)."

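The quoted setup defers hyperparameters to the PyTorch reference recipe and the paper's Appendix A. For orientation, here is a sketch of the defaults in pytorch/vision's references/classification recipe; these ResNet-style values are an assumption about what "the standard PyTorch training recipe" means, not figures taken from the paper.

```python
# Assumed defaults of the pytorch/vision references/classification recipe;
# the paper's per-backbone settings are reported in its Appendix A.
recipe = dict(
    epochs=90,
    batch_size=32,        # per GPU; the paper trains on 4 A100s (8 V100s for Table 3)
    optimizer="sgd",
    lr=0.1,
    momentum=0.9,
    weight_decay=1e-4,
    lr_scheduler="step",  # multiply lr by lr_gamma every lr_step_size epochs
    lr_step_size=30,
    lr_gamma=0.1,
)
```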