Implicit Differentiable Outlier Detection Enable Robust Deep Multimodal Analysis

Authors: Zhu Wang, Sourav Medya, Sathya Ravi

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiments show that it is possible to design models that perform similarly to state-of-the-art results but with significantly fewer samples and less training time. Our models and code are available here: https://github.com/ellenzhuwang/implicit_vkood
Researcher Affiliation | Academia | Zhu Wang, Sourav Medya, Sathya N. Ravi, Department of Computer Science, University of Illinois at Chicago, {zwang260,medya,sathya}@uic.edu
Pseudocode | Yes | Algorithm 1: Fixed Point Network Operator based OOD Detection Layer for Language Features l_j (an illustrative sketch of such a layer follows the table)
Open Source Code | Yes | Our models and code are available here: https://github.com/ellenzhuwang/implicit_vkood
Open Datasets | Yes | We pre-trained on three datasets, including COCO [35], Visual Genome [28], and SBU Captions [47], with a total of 1M images and 6.8M image-caption pairs, approximately 30% less than the baseline (ViLT).
Dataset Splits | No | The paper mentions datasets used for training, fine-tuning, and testing (e.g., the VQAv2 test set and the COCO val dataset), implying standard splits for these benchmarks. However, it does not explicitly provide percentages or counts for training/validation/test splits, stating only "Following standard practice in Vision" for its training strategies.
Hardware Specification | Yes | We pre-trained and fine-tuned both on 8 NVIDIA RTX 2080Ti GPUs, and for inference we used 1 NVIDIA RTX 2080Ti GPU.
Software Dependencies | No | The paper names various software components and models (e.g., RoBERTa, ViT-B/32, CLIP, BLIP, BERT-base, the AdamW optimizer) but does not provide version numbers for any of these software dependencies.
Experiment Setup | Yes | Network training. We pre-trained the model for 10 epochs using the AdamW optimizer [39] with a learning rate of 1e-4 and weight decay of 1e-2. We chose the warm-up phase of the learning rate to be 10% of the total training steps, and the learning rate was decayed linearly to 0 afterwards. Then, we fine-tuned our model for 5 epochs with a learning rate of 2e-4 for all downstream tasks. In addition, we applied RandAugment [12] as the augmentation strategy in the fine-tuning steps. (see the training-configuration sketch after the table)
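
The Pseudocode row refers to the paper's Algorithm 1, a fixed-point-network-operator-based OOD detection layer for language features l_j. The sketch below is not that algorithm: it is a minimal, deep-equilibrium-style illustration assuming a learned residual operator, a norm-based outlier score, a fixed threshold, and a one-step differentiable unrolling in place of true implicit differentiation. All of these choices are assumptions for illustration only; the authors' actual layer is in the linked repository.

```python
import torch
import torch.nn as nn

class FixedPointOODLayer(nn.Module):
    """Illustrative fixed-point OOD layer for language features (all design choices assumed)."""

    def __init__(self, dim: int, max_iters: int = 30, tol: float = 1e-4, tau: float = 2.0):
        super().__init__()
        # Assumed operator: a small MLP defining z -> f([z, l]); not the paper's exact operator.
        self.f = nn.Sequential(nn.Linear(2 * dim, dim), nn.Tanh(), nn.Linear(dim, dim))
        self.max_iters = max_iters
        self.tol = tol
        self.tau = tau  # assumed threshold on the outlier score

    def forward(self, l: torch.Tensor):
        # l: language features of shape (batch, seq_len, dim)
        z = torch.zeros_like(l)
        # Fixed-point iteration z_{k+1} = f([z_k, l]) run without tracking gradients.
        with torch.no_grad():
            for _ in range(self.max_iters):
                z_next = self.f(torch.cat([z, l], dim=-1))
                converged = (z_next - z).norm() / (z.norm() + 1e-8) < self.tol
                z = z_next
                if converged:
                    break
        # One differentiable application so the surrounding network can backpropagate;
        # the paper would instead differentiate implicitly through the fixed point.
        z = self.f(torch.cat([z, l], dim=-1))
        # Assumed OOD score: distance of each token's equilibrium feature from the batch mean.
        score = (z - z.mean(dim=(0, 1), keepdim=True)).norm(dim=-1)
        mask = (score < self.tau).float().unsqueeze(-1)  # 1 = keep, 0 = treat as outlier
        return z * mask, score
```

A call like `FixedPointOODLayer(dim=768)(l_j)`, with `l_j` of shape (batch, tokens, 768), would return the masked equilibrium features and the per-token scores; for the method actually evaluated in the paper, consult Algorithm 1 and the released code.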
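
The Experiment Setup row fixes the optimizer (AdamW), learning rates (1e-4 pre-training, 2e-4 fine-tuning), weight decay (1e-2), a warm-up of 10% of total steps, and linear decay to 0. Below is a minimal PyTorch sketch of that schedule; the stand-in model and the total step count are placeholders, since those values are not given in the quoted text.

```python
import torch
from torch.optim import AdamW
from torch.optim.lr_scheduler import LambdaLR

# Placeholders (assumptions): the real model and total step count are not in the quoted setup.
model = torch.nn.Linear(768, 768)        # stand-in for the multimodal model
total_steps = 100_000                    # assumed; in practice derived from epochs x steps per epoch
warmup_steps = int(0.10 * total_steps)   # "warm-up phase ... 10% of the total training steps"

# Pre-training: AdamW with lr 1e-4 and weight decay 1e-2 (fine-tuning would switch to lr 2e-4).
optimizer = AdamW(model.parameters(), lr=1e-4, weight_decay=1e-2)

def lr_lambda(step: int) -> float:
    """Linear warm-up to the base lr, then linear decay to 0 over the remaining steps."""
    if step < warmup_steps:
        return step / max(1, warmup_steps)
    return max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))

scheduler = LambdaLR(optimizer, lr_lambda)

# Per training step: loss.backward(); optimizer.step(); scheduler.step(); optimizer.zero_grad()
```

Fine-tuning would reuse the same warm-up/decay shape with lr=2e-4 over 5 epochs; the RandAugment policy mentioned in the quote could be applied to the image inputs (e.g., via torchvision.transforms.RandAugment), though the paper's exact augmentation parameters are not stated here.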