Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.
Deliberative Explanations: visualizing network insecurities
Authors: Pei Wang, Nuno Vasconcelos
NeurIPS 2019 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section we discuss experiments performed to evaluate the quality of deliberative explanations. |
| Researcher Affiliation | Academia | Pei Wang and Nuno Vasconcelos Department of Electrical and Computer Engineering University of California, San Diego EMAIL |
| Pseudocode | No | The paper does not include explicit pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide a link to open-source code or explicitly state that the code for the methodology is available. |
| Open Datasets | Yes | Experiments were performed on the CUB200 [48] and ADE20K [53] datasets. |
| Dataset Splits | Yes | We assume a training set $D$ of $N$ i.i.d. samples $D = \{(x_i, y_i)\}_{i=1}^{N}$, where $y_i$ is the label of image $x_i$, and a test set $T = \{(x_j, y_j)\}_{j=1}^{M}$. Test set labels are only used to evaluate performance. ... All results are presented on the standard CUB200 test set and the official validation set of ADE20K. |
| Hardware Specification | No | The paper mentions network architectures (VGG16, ResNet50, AlexNet) but provides no specific details about the hardware (GPU, CPU models, memory) used for experiments. |
| Software Dependencies | No | The paper does not list specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | All experiments used candidate class sets C(i, j) of 3 members and among top 5 predictions, and were run three times. ... For each image, T is chosen so that insecurities cover from 1% to 90% of the image, with steps of 1%. |
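
The threshold sweep quoted in the Experiment Setup row can be illustrated with a short sketch. This is not the authors' implementation: it assumes the insecurity score is a 2D map of per-pixel values, that a pixel counts as covered when its score is at least T, and that T is picked by ranking the scores. The function names `threshold_for_coverage` and `coverage` are hypothetical.

```python
def threshold_for_coverage(scores, percent):
    """Pick a threshold T so that about `percent`% of pixels score >= T.

    `scores` is a 2D list of per-pixel values (an assumed representation).
    `percent` is an integer, which avoids floating-point rounding issues.
    """
    flat = sorted((v for row in scores for v in row), reverse=True)
    # Integer ceiling of percent * len(flat) / 100, with at least one pixel kept.
    k = max(1, (percent * len(flat) + 99) // 100)
    return flat[k - 1]


def coverage(scores, threshold):
    """Fraction of pixels whose score is at least `threshold`."""
    flat = [v for row in scores for v in row]
    return sum(v >= threshold for v in flat) / len(flat)


# Toy 10x10 map with distinct scores 0..99 (made-up data for illustration).
scores = [[r * 10 + c for c in range(10)] for r in range(10)]

# Sweep the target coverage from 1% to 90% in steps of 1%.
thresholds = [threshold_for_coverage(scores, p) for p in range(1, 91)]
```

With tied scores the realized coverage can exceed the target, because every pixel equal to T is included; a real implementation would need to decide how to break such ties.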